
I feel like any field of research has a base set of knowledge and skill sets required to do high-level research. One could say that biology research is filled with "Microscope Bullshittery" or that paleontology is filled with "Fossil Digging Bullshittery".

A base skill of doing computer science research is programming.

Can you program without computer science? Absolutely. Can you Computer Science without programming? I would say no. Being able to look at and understand how to use a language or library is just something required to get to the last tier of computer science knowledge. I think every researcher would love to get rid of their bullshittery, and often they have lab technicians or interns do it for them, but in the end they all had to pay their dues and have to know it in order to mentor and help those below them.



The author specifically calls out that he's not talking about programming, per se. He's talking about the skill set of wrestling useful free software packages to one's own aims:

So perhaps what is more important to a researcher than programming ability is adeptness at dealing with command-line bullshittery, since that enables one to become 10x or even 100x more productive than peers by finding, installing, configuring, customizing, and remixing the appropriate pieces of free software.

I'm torn about this article. Clearly this researcher, in his role as mentor, has identified a skill gap that's hindering his students. And it's perhaps even a problem that the software community can ease the pain of. But many of the things he lists in passing get down to fundamental tools of software work: version control, package management, data manipulation, etc. Yes, the usage of these things on the command line tends to be "arcane", but that's because each is encoding its own problem domain. And if you're going to be working in software in any non-ivory-tower capacity, you'd better know this stuff.

I've dealt with this kind of problem numerous times before in various contexts with workflow tooling. I.e. a single (usually) command-line tool that neatly encapsulates the most common development use cases to reduce learning curves, cycle time, and errors. These can be phenomenally successful if done well, but if the context doesn't define a workflow (e.g. student A vs. student B's research ideas) then there's no easy way to encapsulate the user's problems.


> Yes, the usage of these things on the command line tends to be "arcane", but that's because each is encoding its own problem domain.

Not necessarily. I came to believe lately that Git, for instance, has beautiful, simple, and powerful core principles… and an unacceptably crappy user interface. A hashed DAG of commits with a few pointers to navigate it, that's great. But the magic incantations that you're required to type on the command line are too complex, unintuitive, inconsistent… and intellectually utterly uninteresting.

Git's core model is the interesting part, the one that will make you a better programmer, or computer user, or whatever it is you want to do that involves version control. But the specifics of the command line interface? That's neither interesting nor a portable skill. "Command line bullshittery" is a perfect term to describe it.
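
For instance (a small sketch, assuming you're at the top level of an existing repository), the core model shows through a few plumbing commands:

  $ git cat-file -p HEAD       # a commit is just a tree, parent pointer(s), and metadata
  $ git log --graph --oneline  # the DAG of commits, drawn as ASCII art
  $ cat .git/HEAD              # HEAD itself is just a pointer (a ref) into that DAG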

Why I believe that has been said better than I can ever do here: http://tonsky.me/blog/reinventing-git-interface/

Seriously, even "end losers" could use this. I also believe this can be generalised: some software just isn't usable through the command line. For day to day interactive use, it needs a neat, special purpose graphical user interface —Bret Victor has taught us how powerful they can be.

The command line is still invaluable when interacting with other software, or for automation. Then it should be designed for those, not for interactive use. Simply put, it should be an API —which you could use to build your wonderful GUI on top of.


Paul Graham has talked about the pain of installing software, and every time I have to do it, I always have trepidation. "Is this going to be the time apt-get barfs at me?"

Software installation is still a big pile of bullshit. For the people who spend their time deep inside one ecosystem, it can be okay, but most people have something to do besides live deep inside one ecosystem.

A few weeks ago I was just trying to find a JavaScript minifier on a Linux VM. So I googled and spent an hour digging through various pieces of crap, incompatible versions of libraries, asinine "gem install" error messages, and fun reading Stack Overflow answers saying things like "why didn't you have lib-foo-fuck-you installed already?"

And none of this is valuable for me to learn because in five years all the current package maintenance stuff is going to be thrown out and replaced. Not necessarily by something better (although that's likely the hope, leading to http://xkcd.com/927/ ).


He also goes on to say,

"Throughout this entire ordeal where I'm uttering ridiculous epithets like 'git pipe fork pipe stdout pipe stderr apt-get revert rollback pipe pipe grep pipe to less make install redirect rm rm ls ls -l ls tar -zxvf rm rm rm ssh mv ssh curl wget pip,'"

In other words, "ridiculous epithets" seems to be equivalent to telling the machine to do something. Have you got a way to get git to control your source without actually invoking git?

Workflow tooling can indeed be incredibly useful, but the context isn't the only requirement for success. If something underpinning that tooling changes or breaks, someone is going to have to understand what happened.

The people who regard that understanding as "ridiculous" are the worst people to work with and to my mind are the primary reasons that this "profession" gets little respect.


> Have you got a way to get git to control your source without actually invoking git?

No, but git does involve ridiculous epithets. No quotes, because I'm dead serious. As an interface, the Git command line is laughable, and doesn't deserve a passing grade. Yes, it's the only one we've got. Yes, many interfaces are even worse. Still, that's no excuse. We can do better. Hopefully someone will: http://tonsky.me/blog/reinventing-git-interface/

---

Let's take a simpler example:

  $ tar -xzf foo.tar.gz
So, you have to learn the name "tar". The option "-x" for extract, the option "z" for gzip, and the option "-f" for file (by the way, the "f" must come last, or it won't work). What the fuck is this retarded interface?

First, why do I have to tell tar to extract the thing, since it's obviously a compressed archive? Why do I have to tell tar that it's in gzip format? It can decompress it, surely it can check the compression format? And why, why, WHY do I have to tell it I'm processing a file? It KNOWS it's a freaking file!!!

Surely there must be an alternative, like… like…

  $ decompress foo.tar.gz
I personally don't know of such an alternative, and don't use one, because I was retarded enough to learn the basic 'tar' incantations by heart. Now that I know them, I can't summon the courage to use a decent interface instead.

But you see my point. Even for a simple tool such as tar, the common use case involves a cryptic incantation that shouldn't be needed in the first place. I'm sure many other UNIX tools are like that; I just didn't think about critiquing their interfaces thoroughly. Yet.

The core principles of the command line are very good. The level of relatively easy automation it provides is nothing short of amazing. This technology from the 70s is arguably more capable than most graphical interfaces in current use. But it does have its fair share of incidental complexity and useless quirks. We can do better. Let's not forget that.


Let's take your tar example. The -z and -x options are flags; they specify binary on/off options. You can specify all the flags separately on the command invocation like so:

  $ tar -x -z -f foo.tar.gz
However, typing -flag -flag2 -flag3 is too many keystrokes, so as a convenience you can combine all the flags in one call: -xzf. The -f isn't really a flag, though; it takes an argument, which in this case is the filename foo.tar.gz. The argument is required and comes directly after the option, which is why the f has to come last in the combined form. Order doesn't matter for the x and z because they don't take arguments; they are just flags. To see why this matters, add in another flag like -C, which also takes an argument; you would end up with:

  $ tar -xzfC foo.tar.gz directory_to_change_to
Which argument goes to which flag? Maybe the first flag gets the first argument? Then your argument order changes if you type in the flags backwards.
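
The unambiguous way to write that (a sketch; as far as I know both GNU and BSD tar accept it) is to give each argument-taking option its argument directly:

  $ tar -x -z -C directory_to_change_to -f foo.tar.gz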

I don't know about your z flag; GNU tar doesn't need it when extracting. The x flag is needed because tar can do things other than extract, like list the contents of the archive with the -t flag, or create a new archive with -c.

Finally, why is the -f option required? My first assumption was that maybe it's because you need to specify the output file when you are creating an archive. I took a look in the manpage, and the reason is a lot more interesting.

  Use archive file or device ARCHIVE.  If this option is not given, tar will
  first examine the environment variable `TAPE'. If it is set, its value will
  be used as the archive name. Otherwise, tar will assume the compiled-in
  default.
I knew that tar's name comes from the phrase "tape archive" but I hadn't put two and two together. Of course you need to specify if you are writing the archive to a file, because tar was created to back up data to tape! If you think about it, tar is actually doing the "right thing". Considering why it was written, tar has a sane default: write the data to the tape drive.

Maybe you already understand all this and I'm reading too much into your simple example. It feels to me, though, that when people have issues with something like the Unix command line, it's because they just wanted to get something done and memorized an incantation to do it. There isn't anything wrong with that, of course, but a tool like tar is SO much more powerful than just decompressing files. Once you start to dig into it, there is an internal consistency and logic to it.


> Maybe you already understand all this

Yes, I do. Every single item. I just feel for the hapless student who is required to send a compressed archive of his work to the teacher, and is using tar for the first time.

There's only one little exception: I didn't know GNU tar doesn't require the '-z' flag (which by the way means 'this is a gzip archive') when extracting a tar.gz archive. Anyway, I bet my hat that '-z' is required if you compress something and output the result to standard output: there will be no '.gz' hint to help the tool magically understand you want it compressed. If you omit it, tar will likely not compress anything.
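
A quick sketch of what I mean (assuming GNU tar):

  $ tar -cz some_dir > out.tar.gz   # gzip-compressed archive on stdout
  $ tar -c  some_dir > out.tar.gz   # plain, uncompressed tar; the '.gz' in the name is a lie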

The '-f' option is the most aggravating. Nobody uses tapes any more. Tar was doing the right thing, but no longer. -f should be dropped, or rendered optional, or replaced by '-o' for consistency with compilers… As it is, it's just ugly.

> It feels to me though that when people have issues with something like the unix command line its because they just wanted to get something done and memorized an incantation to do it. There isn't anything wrong with that of course […]

Actually there is. The users want to do something (usually a very common case such as compressing or decompressing an archive), then they have to memorise an arcane incantation. Yes, tar can do much more. Yes, the command line is powerful and flexible and more. This is Good. (Seriously, I miss my command line whenever I have to touch Windows.) On the other hand, some common cases are just not well handled, and it is full of idiosyncrasies that have nothing to do with the aforementioned benefits.

When the user wants to decompress files, it should not be more complicated than 'decompress archive.tar.gz'. Though thanks to an uncle comment, I now know of the 'unp' tool, which works just like that: 'unp archive.tar.gz', and you're done. (And the man page is refreshingly short.)


You don't specify -f to tell it you're processing a file; you specify -f to tell it that the next argument is the filename. And it doesn't have to come last.

    tar -z -f foo.tar.gz -x
That's a perfectly valid tar command. Also, obviously you have to tell it that you're extracting the file. How else would it know that you don't want to create an archive?


You're the second commenter who believed I didn't know this stuff.

I know that, and more. But go and explain each and every flag to a student who just wants to extract the first lesson of his first UNIX course. At this point, it is all magic and arbitrary.


> $ decompress foo.tar.gz

Try unp, it's in the repo.


Told you we could do better! Seriously you just made my day. I'll use this from now on.


I think the biggest issue is that beyond the setup phase of development, these tools don't get used by the new researchers the author works with. If I'm developing a new program, having to run ten programs I've never seen before just to get started can be frustrating if I won't be actively using them as I work.

Programmers should learn the tools to stay efficient. Version control, build tools, etc. are priceless. But if you force-feed too much at once, nothing will stick. Couple that with what for some may be their first time on the command line, and you have a recipe for bullshittery.

I'm having a hard time understanding other commenters' grief that explaining git to someone who's never typed "ls" before is anything less than bullshit to slog through. These things are best learned one or two at a time.


> Clearly this researcher, in his role as mentor, has identified a skill gap that's hindering his students.

This is a well-known pre-course prep step, part of a bigger to-do list for all teachers: make sure the tools you suggest to the students are bundled so they can be set up in an easy way for your target audience.


I don't think he's talking about courses here. He's talking about his role as an advisor, not lecturer.

This issue comes up all the time with young researchers (i.e. graduate students). There are a huge number of free and open source packages that can help them implement and test their ideas, but actually getting them to work together can be an exercise in yak shaving.

That being said, having facility with command line tools is a valuable skill for any researcher.


Paraphrasing some actual experiences I've had: I want to install GNU Guile 2 on my Mac laptop so that I can write a prototype of an AI program. To install Guile 2 I need to install some prerequisite library. The prerequisite library won't build with the version of GCC I have installed. The easiest way to upgrade GCC is to get the newest version of Apple Xcode tools. The newest version of Xcode tools requires the latest version of OS X. But I also run Avid Pro Tools for music production on this computer, and the latest version of OS X is not clearly compatible with my version of Avid Pro Tools. So I'd need to pay $300 to upgrade Pro Tools so I can upgrade OS X so I can upgrade Xcode so I can upgrade GCC so I can build a library so I can install Guile 2.


Yeah, I've definitely felt that pain as well.

In this case, I'd suggest using a VM driven by Vagrant, unless you really need to be running native under OS X. That provides an isolated and repeatable environment, but at the cost of learning whole other domains of experience. My suggestion also hugely reinforces Dr. Guo's point: we've perhaps solved a problem by adding piles of additional tooling layers (vagrant CLI, Vagrantfile interface, VM domain knowledge, setting up a Linux host (even as a toy environment), setting up a Linux host as a build environment, etc.). Heck, if we're doing it right, it'd be nice to use a provisioning tool to automate the VM and build environment setup. All of this stuff is awesomely powerful, but front-loading a student project with it is nuts unless there's a domain expert who's building this tooling for them and coaching the students through it. Again, that's not the sort of thing that's likely to happen for per-student research projects.
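
The happy path is only a few commands (a rough sketch; the box name is just an example), but each one hides a whole layer the student eventually has to understand:

  $ vagrant init hashicorp/precise64   # writes a Vagrantfile describing the VM
  $ vagrant up                         # downloads the box and boots the VM
  $ vagrant ssh                        # drops you into a shell inside it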


Isn't that simply one of the downsides of choosing to work on OSX?

Or maybe `brew install guile`?


Well, yes and no. If you want to look at something very small, you have to learn how to use a microscope. It's not that hard to learn how to use a simple microscope, and you can make adjustments by thinking about the physical principles, which are universal. And if a certain type of microscope isn't suitable, you can always switch, or someone can design a better one.

The *nix command line is not based on physical law. There is no mathematical or physical reason that human-computer interaction has to be through an underpowered, user-hostile design that requires lots of unnecessary memorization. People use it because everyone does. You can come up with a better design, but you'll still have to use the old design if you want to interact with anyone else.

Edit: By "underpowered", I mean that passing byte streams and parsing them is less powerful than passing objects. And this is made worse by the fact that different commands have different output syntaxes.


If the command line were underpowered, it would have been abandoned decades ago. It's still around precisely because it's incredibly powerful, and nobody has come up with a suitable replacement that doesn't involve sacrificing a lot of power.

Further, all of the "No, obviously this would work much better!"s have been tried already. Dozens of times, if not hundreds. Some of them you can even download right now. Nevertheless, the command line persists. It may not be perfect in every detail, but it's far harder to exceed than it seems at an initial, frustrated glance. If you hope to someday replace it, it would be wise to understand exactly why that is, lest your effort completely fail in exactly the same way as so many prior attempts.


It's not that the command line is necessarily underpowered, imo; it's that the discoverability and intuitiveness of the *nix userland is pretty low. Man pages are a great way to document every flag a program accepts, but are a terrible way for someone to figure out a simple use case for the program. Similarly, flags are named without any regard for intuitiveness (-l is liable to mean different things for any program you encounter, and the precise meanings one program decided on can be hard to remember because there is no explicit commonality with any other tool).
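
For instance (standard tools, same letter, three unrelated meanings):

  $ ls -l somedir/       # -l: long listing format
  $ wc -l somefile       # -l: count lines
  $ grep -l error *.log  # -l: print only the names of files that match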

In that sense, it absolutely is a user hostile design and it involves lots of unnecessary memorization.


It's unintuitive if you come from Window-GUI-land. Within UNIX-land, it's plenty intuitive that you should consult the manual.

Remember, intuition is not some inborn instinct. It's a product of training. To a user of MULTICS, iOS is wildly unintuitive.


Yes, by no means consider my post a defense of every quirky detail of history. My point is more that one must start by correctly identifying the problems before one could hope to solve the problems, and "lack of power" is definitely not it.


The command line is the heart of Unix and Friends, which is why it wasn't abandoned decades ago. And it's almost impossible to replace the *nixes because of network effects and extraordinarily high costs.

The command line is "powerful" because it is "simple" in the sense that it doesn't really do anything for you. Commands have their own input and output syntax based on their needs, and it's up to the user to figure out how to fit it all together. I think having one standard serialization format, so that you wouldn't have to waste time learning and thinking about each command's special little syntax, would be much more powerful.
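
To illustrate (a rough sketch; exact column positions vary by platform): pulling "a name and a number" out of two different commands means memorizing two unrelated column layouts, because each command invents its own output format:

  $ ls -l  | awk '{print $9, $5}'   # file name and size, per ls's columns
  $ ps aux | awk '{print $11, $2}'  # command and PID, per ps's columns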


I use command lines maybe once a month. For almost everything, it has been replaced. For those places it hasn't, I blame the tool for not automating some process that could have easily been automated.


I absolutely live at the command line, and when I'm forced into a walled garden I blame the tool for not properly exposing its functionality to the rest of the system so I can easily automate it the way I want it automated.

Edited to add:

Note that this isn't a claim that anyone else (in particular) is Doing It Wrong - it's what works for me, and I'm sure there are other points in the space that work as well or better for others. That, itself, should not be taken as a claim that every point in the space is equivalent - that's not true either. What I do strongly dispute is any notion that the command line is "underpowered" in general.


Used to be me. But I got sick of remembering all the arcane stuff. Now it's a pulldown menu away in an IDE.


That reads a bit odd.

"I used to be fluent in French. But I got sick of remembering all the arcane stuff. Now it's just pages away in a phrase book."

Probably you were never really fluent in French in the first place. Which is to say, you were never really entirely comfortable at the shell. Which is fine - as I said, I don't assert that it's the best fit for everyone. But for me, just like producing English doesn't feel like I'm "remembering arcane stuff", neither does producing Bash, even though I fully recognize that objectively both are plenty weird.


That's just wrong. I've written shells. I've written tools. I started in this business before IDEs existed. Some folks grow out of it; some stay because it's so cool to know what all those switches mean.


"That's just wrong."

Please explain, rather than simply asserting.

"I've written shells. I've written tools."

I've written GUIs. That's pretty well unrelated to whether the interface fits you well.

"I started in this business before IDE's existed."

I don't see how that undermines my point. It would explain why you wound up using the shell despite it being a poor fit for you.

"Some folks grow out of it, some stay because its so cool to know what all those switches mean."

Veiled insults don't make your point stronger.


What you said. It's wrong. I'm not like that.

I was a command-line master. Because you had to be, before IDEs.

And the last comment was to match the tone of the parent comment.


"I was a command-line master."

Okay.

"And the last comment was to match the tone of the parent comment."

Except you didn't match the tone of the parent comment. You said, "You're juvenile and only prefer the command line because of pride." The parent contains nothing similar; the nearest I see is a reasoned guess at your level of comfort using a tool - not even your level of skill with the tool - accompanied by an explicit statement that it's not intended as a judgement. Even if it's wrong, that's not an insult.

If this is the level of discussion I'm going to be getting, I'm done with this thread.


I guess I was put off by the glib dissection of my skills and personality. Maybe it didn't seem that way when written, but it sure did when read. It's called 'ad hominem', and I responded in kind, which was probably not very cool. I apologize.


Some days I wish someone would have already created a tool to do all of the stuff I needed to do.


They did, you just have to do a bunch of command-line bullshittery to get it to work.


That day, no one will pay you to do what you do.


> (...) underpowered, user hostile design that requires lots of unnecessary memorization.

User hostile I can understand. Underpowered is just plain false. Parsing log files, relating data between them and aggregating results is a common task for any decent sysadmin. I know no interface as powerful (measured as the ratio of information quality over time spent) as the Unix command line.
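
A typical example (a sketch, assuming the common web-server log format where the request path is field 7): find the most-requested URLs that returned 404.

  $ grep ' 404 ' access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head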

Many powerful interfaces have ultra steep learning curves. The Unix shell is one of them. Steep learning curves are a flaw, of course, but they do not invalidate the other qualities of the interface.


>an underpowered, user hostile design

In your opinion, of course. Many of us find the *nix command line not only elegant and highly productive, but actually enjoyable.

In fact, the only Linux command line tools that I did not immediately start using in a highly productive manner are `tar` and `find`, arguably two of the tools that least abide by the ideals of *nix command line tools (I've since gotten used to `find`, but `tar` may always send me searching for my personal wiki's corresponding entry[1]).

If you want to argue that there are better ways to program on the console, I grant that it's possible -- although, as mentioned, I find the composability of *nix tools to be an almost magically productive approach.

But to condemn anyone to a life of GUI tools is not only going to drastically increase their chance of developing carpal tunnel syndrome, but also inevitably will slow down their workflow -- often drastically. It's simply not possible for even an experienced user to point and click with a mouse as fast as a fluent typist can issue commands on the terminal.

1. http://xkcd.com/1168/


I too love the CLI, but I wish someone would go through and standardize flags, naming, ordering concerns, long-form arguments, etc. across all of the POSIX tools. GNU put in some effort here, but in my opinion they didn't go far enough, and non-GNU operating systems (BSD, OS X, etc.) are missing out.
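
GNU's long options are a taste of what that could look like. For example (assuming GNU tar):

  $ tar --extract --gzip --file foo.tar.gz
reads a lot less like line noise, though the long forms are far from universal across tools and platforms.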

Actually, as much as I dislike Apple, they're probably one of the few entities that could pull off something like this.


Honestly I feel like most Unix command-line tools are pretty good. I can make a good guess what "-n" and "-i" will mean on a new Unix tool based on context.

The git commands are a big exception; it's like each individual git command was written by its own committee, each deep with NIH syndrome. The one example I remember is two different git tools having a different way of colorizing their output.


I so agree with you. The only thing I hate about the *nix userland is that all tools follow their own set of arbitrary conventions regarding the things you mentioned. It really limits the intuitiveness and discoverability of the system.


I don't think Apple could. Their influence on serverland is virtually nil.


Part of the reason that people still use the *nix style interface is that the large number of people who thought they'd come up with a better design were actually wrong about that. It's harder than it looks.


> Can you Computer Science without programming?

The bullshittery he's talking about isn't programming, but compiling, installing, and setting up various pieces of software that are still in an early stage of development and aren't user-friendly yet. It's just something you have to do, but it isn't related to programming or research.

In a way, I agree that an all-around computer scientist has to know how to deal with this, but it's probably not the best way they can spend their time.


One of the neat things about computer science research is that researcher B's work is frequently based on researcher A's software and A's software is "academic", which means that it actually works on exactly the three examples in A's dissertation.


You can totally "Computer Science" without programming. It's a big field, and a big portion of it is pretty much math. You probably cannot "Applied Computer Science" without programming.


I just replied in the context of the article. I'm assuming that if you are doing math proofs all day, you probably aren't dealing with "command line bullshittery".


I initially thought the same thing. But then I considered that many of the theoreticians probably write their papers in LaTeX.


I wouldn't generally call that programming.


But it does involve command-line bullshittery, which is more what the article is about.

The article isn't necessarily bemoaning the state of programming languages or their tooling; it's more about all the steps to get a programming language toolchain off the ground and working (like an Android dev environment, etc., or a cross-compiling environment).


Certainly the case. My comment was an aside directed at its parent, not deeply relevant to the article itself.



