
Awk is incredibly useful. I wrote a script this week to parse Postgres logs (many, many GB) to answer the question, "what were the top users making queries in the first few minutes at the top of every hour?" [0] Took a couple of functions, maybe 20 LOC in total, plus some pipes through sort and uniq [1]. Also quite fast, especially if you prefix it with LC_ALL=C.

[0]: If you're wondering why there wasn't adequate observability in place to not have to do this, you're not wrong, but sometimes you must live with the reality you have.

[1]: Yes, I know gawk can do a unique sort while building the list. It was late into an incident and I was tired, and | sort | uniq -c | sort -rn is a lot easier to remember.

[1].a: Yes, I know sort has a -u arg. It doesn't provide a count, and its unique function is also quite a bit slower than uniq's implementation.
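For anyone who hasn't met the `| sort | uniq -c | sort -rn` idiom: it counts identical lines and lists them most frequent first. A minimal Python sketch of the same idea, not the actual script; the `top_counts` name and the sample user names are made up for illustration:

```python
from collections import Counter


def top_counts(items, n=None):
    """Rough equivalent of `sort | uniq -c | sort -rn` over an iterable of strings.

    Returns (item, count) pairs, most frequent first. Ties come back in
    first-seen order, which can differ from how sort -rn orders equal counts.
    """
    return Counter(items).most_common(n)


# Hypothetical input: one user name per matching log line.
users = ["alice", "bob", "alice", "carol", "alice", "bob"]

for user, count in top_counts(users):
    # Mimic the count-then-value layout that uniq -c prints.
    print(f"{count:7d} {user}")
```

Unlike the shell pipeline, this holds every distinct value in memory at once, which is fine for user names but worth keeping in mind for high-cardinality fields.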



You can do that in far fewer lines of code in Python, and with much better performance.


I suspect the performance part is only true if you're familiar with Python's stdlib performance quirks, like how `datetime.fromisoformat()` is dramatically faster than `datetime.strptime()` (roughly 57x in the runs below, and `strptime()` is likely the function you'd reach for if you weren't familiar), or at the very least, that directly casting slices of the string into ints lands in between the two. This example parses a file of 10,000,000 ISO 8601 datetimes, then counts those between `HH:00:SS – HH:02:SS` inclusive. The count method is the same in all three runs and is likely benefiting from some OS caching, but the parse times remained constant even with repeated runs.

    $ python3 times_python.py strptime_listcomp
    parse time: 45.96 seconds
    count time: 0.54 seconds
    count: 498621

    $ python3 times_python.py slices
    parse time: 9.96 seconds
    count time: 0.40 seconds
    count: 498621

    $ python3 times_python.py isofmt
    parse time: 0.80 seconds
    count time: 0.38 seconds
    count: 498621
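The script itself isn't shown here, so this is a minimal sketch of what the three variants might look like, not the code behind the numbers above. The function names, the `YYYY-MM-DDTHH:MM:SS` layout with no timezone, and the small in-memory sample are all assumptions:

```python
import time
from datetime import datetime

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"  # assumed layout, e.g. 2024-05-01T13:02:59


def parse_strptime(stamp: str) -> datetime:
    # General-purpose parser: interprets the format string on every call.
    return datetime.strptime(stamp, ISO_FORMAT)


def parse_slices(stamp: str) -> datetime:
    # Relies on the fixed-width layout: slice each field and cast it to int.
    return datetime(
        int(stamp[0:4]), int(stamp[5:7]), int(stamp[8:10]),
        int(stamp[11:13]), int(stamp[14:16]), int(stamp[17:19]),
    )


def parse_isofmt(stamp: str) -> datetime:
    # Dedicated ISO 8601 parser from the standard library (Python 3.7+).
    return datetime.fromisoformat(stamp)


PARSERS = {
    "strptime_listcomp": parse_strptime,
    "slices": parse_slices,
    "isofmt": parse_isofmt,
}


def count_top_of_hour(stamps) -> int:
    # HH:00:SS through HH:02:SS inclusive, i.e. minutes 0, 1 and 2.
    return sum(1 for stamp in stamps if stamp.minute <= 2)


def timed_parse(parser, lines):
    # Parse every line with a list comprehension and report wall-clock time.
    start = time.perf_counter()
    parsed = [parser(line) for line in lines]
    return parsed, time.perf_counter() - start


# Hypothetical sample data; the real test read 10,000,000 lines from a file.
SAMPLE = [
    "2024-05-01T13:00:15",
    "2024-05-01T13:02:59",
    "2024-05-01T13:03:00",
    "2024-05-01T14:30:00",
]

if __name__ == "__main__":
    lines = SAMPLE * 25_000
    for name, parser in PARSERS.items():
        parsed, elapsed = timed_parse(parser, lines)
        print(f"{name}: parse time {elapsed:.2f} s, count {count_top_of_hour(parsed)}")
```

Absolute timings will vary with the machine and Python version, so treat any numbers from this sketch as relative only.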



