
> every programmer and especially every sysadmin should learn

There are lots of things "every <tech position> should learn", usually said by people who already did so. I still have a bunch of AI/ML items on that list too.

What's the advantage of learning AWK over Perl?



> What's the advantage of learning AWK over Perl?

Getting awk in your head (fully) takes about an afternoon: reading the (small and exhaustive) man page, going through a few examples, trying to build a few toys with it. Perl requires much, much more effort.

Great gain/investment ratio.
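
For a sense of what those toys look like, here's the kind of thing you can write after that afternoon (a word-frequency counter; the file name is made up):

    # count how often each word appears; print the ten most common
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END { for (w in count) print count[w], w }' notes.txt | sort -rn | head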


Another commenter said something similar, but nothing says you have to learn everything: you could learn a subset of perl that does everything you would want to do with awk. Would that take as long?


Yup, but defining that subset isn't free! Perhaps some people did the work already, but I'd still be cautious as to how much Perl one actually needs to know to use those comfortably.


Both will get you where you want to go, but I don't think the use cases for perl and awk are the same.

I reach for awk when my bash scripts get a bit messy; perl is/was for when I want to build a small application (or nowadays python).

But both perl and python require cpan/pip to get the most out of them, whereas with awk, I just need awk.


Is there any particular functionality which does exist in awk, but doesn't exist in Perl or Python without third-party libraries? I've always found "Python + built-in modules" more than sufficient for my text-manipulation needs. (Also, it lets me handle binary data and character data in the same program, which is very useful for certain tasks.)


It’s just that awk has a concise syntax that can make for some really quick one-liners in your terminal prompt. Why spend a minute or two in Python if you can get an answer in 15 seconds instead?
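
For example, summing a column (the column number and file name are made up, just to show the shape of it):

    # total the 5th whitespace-separated field of every line
    awk '{ sum += $5 } END { print sum }' data.txt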


> Why spend a minute or two in Python if you can get an answer in 15 seconds instead?

Because you (or someone else) can run your Python later if needed, and have confidence the output will be the same.

Sure, there are times when a one-liner is needed, and you can always put that one line in a document for others to run. I can think of many times when I was on-call and needed to grep some data out of logs that wasn't already in a graph/dashboard somewhere: when time is of the essence, or when you're really, really sure you won't need to run the same or similar thing ever again, even if the data changes. I even changed my shell to make up-arrow go through commands with the same prefix instead of linearly traversing command history, because I had so many useful one-liners I re-ran later.

But as I've gotten more experienced, I've come to appreciate the value of committing those one-liners to code as early as possible, and having that code reviewed. Sometimes a really useful tool will even emerge from that.


I put off learning awk for literal decades because I knew perl, but then I picked it up and wish I had done so earlier. I still prefer perl for a lot of use cases, but in one-liners, awk's syntax makes working with specific fields a lot more convenient than perl's autosplit mode. `$1` instead of `$F[0]`, basically.
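
Concretely, the same field extraction in both (hypothetical input file):

    awk '{ print $1 }' file.txt
    perl -lane 'print $F[0]' file.txt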


But then couldn't you use `cut` for even simpler syntax?


`cut` doesn’t work natively on data that’s been aligned with multiple spaces; you need a `tr -s` pass first.

It also doesn’t let you reorder or splice together fields.

I used it for years but now that I have a working understanding of `awk` I have never looked back.
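
For example, pulling the second column out of space-aligned output like `ps aux` (just an illustration):

    ps aux | tr -s ' ' | cut -d ' ' -f 2    # squeeze the padding first, then cut
    ps aux | awk '{ print $2 }'             # awk splits on runs of whitespace by default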


FreeBSD cut has -w for that ("split by any amount of whitespace"), but that never made it into GNU cut. Sad, because it's mega useful.

Of course awk can do much more, but if all you want is "| awk '{print $2}'" then "cut -wf2" is so much more convenient.


Reordering and splicing are common enough that it’s easier just to always use awk, since the cost of rewriting one to the other is significantly higher.
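
e.g. something like this is trivial in awk but impossible with cut alone (made-up file):

    # swap the first two columns and glue them together with a colon
    awk '{ print $2 ":" $1 }' file.txt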


Maybe if all you want to do is unconditionally extract certain columns from your data. But even in that case cut doesn't let you use a regular expression as the field delimiter.
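
For instance (hypothetical log file, just to show the difference):

    # split fields on a comma or semicolon, and only act on matching lines
    awk -F '[,;]' '/ERROR/ { print $3 }' events.log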


- Awk is defined in POSIX

- Awk is on more systems than Perl

- Awk has more implementations than Perl


POSIX isn't really relevant. More systems? Debatable. More implementations could be seen as a negative.

Perl is more regular than Awk for the simple cases and is more usable for anything that isn't merely iterating over input.

Of course, you shouldn't use any of awk/perl/shell for tasks that aren't being run by you or that are over, say, 20 lines long.


awk is also a much smaller language than perl, so it's generally less effort to teach, learn, and read.


Is it not possible to learn a subset of perl?


Learning any language more or less starts with learning a subset of it.

Asking a new hire to "learn awk" vs "learn perl" involves two very different time investments.

Tasking someone with "learning a subset of perl" begets the question "what subset?", and a very exhausting conversation follows, with someone routinely asking "so?" and a large amount of time spent re-litigating which subset of perl features we want that awk already supplies.


Which subset, and how do you ensure that every example you come across and everyone you work with sticks to that subset?


> Awk is defined in POSIX

so?

> Awk is on more systems than Perl

By what metric?

> Awk has more implementations than Perl

so?


Whatever you think my opinion of Perl is, you're probably wrong, and the tone of your advocacy is kind of odd.

Awk is older, and as part of POSIX, the version found on unix-like environments will be (outside of extensions) compatible with the others. If one isn't present, or the one that is lacks the extensions you want, you can pick another implementation, even one written in Go, and it'll work.

Perl, and I've been writing Perl since Perl 4, doesn't have those characteristics. It's a much more powerful language that has changed over the years, and it is not always present by default on a unix-like system. Because the maintainers value backward compatibility, even scripts written for Perl 5.005 have a fair chance of working on a modern version, but that can't be assumed (and you shouldn't assume anything about modules). Because Awk is fossilized, you can assume that.


The first and last items in your list give no reason why they are relevant. There is no "tone" or "advocacy"; it's not "odd" to ask for that context, which you've now given here.


Awk is found in small-ish embedded systems that have no reason to waste space on Perl or anything like it.

One reason for this is that the popular BusyBox project includes an Awk implementation: BusyBox Awk.

Pretty much everywhere there is BusyBox, there is an Awk, unless someone went out of their way to compile it out of the BusyBox binary.


Every Linux system comes with awk already on it. Perl has to be installed, and might not be available on a system you don't control.



