There are two conflicting implementations of a parallel utility, and from what I can tell, GNU parallel is much more useful than the one in moreutils. Which meant that when I was 1) doing processing that benefited greatly from parallelization and 2) found that the moreutils version wasn't doing what I wanted, nor could I figure out how to make it do so (compounded by confusion over online searches turning up GNU parallel syntax that didn't work), I had to remove the entire moreutils set to install GNU parallel under Debian.
The two versions aren't even candidates for /etc/alternatives resolution, as their command-line syntax and behavior differ.
Either a name change or refactoring the 'parallel' utility into a separate package would avoid much of this.
And I'd really like to see numutils packaged.
Also: 'unsort': sort -R (a.k.a. --random-sort)
(using GNU coreutils 8.23)
(I'm not familiar with a seed-based randomized sorting utility though.)
Joey's apparent resistance to simply splitting out 'parallel' into its own package is ... disappointing. His final comment (regarding other utilities in upstream and switches) is a collection of non sequiturs and red herrings.
This is a case where boneheads like Joey get to block trivial fixes for decades (5 years and counting in this case). The project really needs a better process to terminate 'lame' maintainers.
I find it entertaining that bash (zsh, ksh, fish, etc) itself provides ways to do what many of these utilities do. The anonymous pipes, named pipes, and process substitution mechanisms can replace many of these tools.
For example:
pee ->
some_process | tee >(command_one) >(command_two) [...] > /dev/null
# A single tee can feed several process substitutions; the final > /dev/null discards the extra copy of the input that tee would otherwise echo (pee doesn't echo its input). This might still need a bit more magic with named pipes to consolidate the output without race conditions, since each command_N runs in parallel. Or take a cue from the chronic replacement below and use a temporary file to execute them serially.
chronic ->
TMPFILE=$(mktemp); some_process > "$TMPFILE" 2>&1 || cat "$TMPFILE"; rm "$TMPFILE"
# Note: the assignment needs its own statement (a VAR=value prefix only sets the variable for that one command), and the redirections must be ordered "> file 2>&1" so stderr follows stdout into the file.
zrun ->
command <(gunzip -c somefile)
Still, having a utility to abstract away the pipes makes sense.
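As a sketch of the serialized temp-file approach mentioned in the pee replacement above, here is a toy helper (fanout is an invented name, not part of moreutils; assumes a POSIX shell):

```shell
# Hypothetical pee-like helper: soak up stdin into a temp file, then
# feed it to each command in turn, so outputs never interleave.
fanout() {
    tmp=$(mktemp)
    cat > "$tmp"                  # capture all of stdin first
    for cmd in "$@"; do
        sh -c "$cmd" < "$tmp"     # run each consumer serially
    done
    rm -f "$tmp"
}

# Each command sees the full input; their outputs appear one after another.
printf 'b\na\n' | fanout cat sort
```

Unlike the parallel tee version, this trades concurrency for deterministic output ordering.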
These are cool, and I use chronic all the time. But is there any more documentation beyond this page? I can't find any, and I'd love to read more about pee and see some examples. It seems there is more documentation for the rejected utilities than the accepted ones!
When you clone the Git repository and build the package all the
utilities come with manual pages built from DocBook, e.g. for chronic:
$ man ./chronic.1 | col -b | grep -v ^$ | head -n 12
CHRONIC(1) CHRONIC(1)
NAME
chronic - runs a command quietly unless it fails
SYNOPSIS
chronic COMMAND...
DESCRIPTION
chronic runs a command, and arranges for its standard out and standard error to only be displayed if the command fails (exits nonzero or
crashes). If the command succeeds, any extraneous output will be hidden.
A common use for chronic is for running a cron job. Rather than trying to keep the command quiet, and having to deal with mails containing
accidental output when it succeeds, and not verbose enough output when it fails, you can just run it verbosely always, and use chronic to
hide the successful output.
0 1 * * * chronic backup # instead of backup >/dev/null 2>&1
More people might use this if the author put those docs online and linked to them from the main page. I don't know DocBook, but it looks like styling it as HTML should be easy.
Now I understand pee. I wrote something less generalized here:
> Unlike a shell redirect, sponge soaks up all its input before opening the output file. This allows constructing pipelines that read from and write to the same file.
GNU Emacs has something like that in Dired. It's called Wdired (writable dired) and allows editing the Dired buffer and then applies the changes. Think of it as editing `ls` output.
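The soak-up-all-input-first behavior quoted above can be imitated with a tiny shell function (mysponge is an invented name for illustration; the real moreutils sponge also writes atomically via rename and supports an -a append flag):

```shell
# Toy sponge: buffer all of stdin before touching the output file,
# so the same file can safely appear on both ends of a pipeline.
mysponge() {
    tmp=$(mktemp)
    cat > "$tmp"        # soak up all of stdin first
    cat "$tmp" > "$1"   # only then (over)write the destination
    rm -f "$tmp"
}

# In-place sort; a plain "sort nums.txt > nums.txt" would truncate
# the file before sort ever read it.
printf '3\n1\n2\n' > nums.txt
sort nums.txt | mysponge nums.txt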
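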
Well, yes, because git repo > source release so it's clearly not equal.
I just checked; the repo tags releases in some reasonably proper manner. (Personally I prefer some prefix like "release_0.2" to a tag simply named "0.2", but it does the job.)
>because git repo > source release so it's clearly not equal.
No, they have different uses. I simply want to install the software, so I want a source release tarball. Source releases include more than what a git repo provides, such as pre-built configure scripts and Makefiles. A tag in a git repo is no substitute for a proper release.
That's true for something that uses e.g. autoconf, but moreutils doesn't build any makefile or configure script, it's right there in the Git repository. So I see what your objection is for packages in general, but it doesn't apply in this case.
We package moreutils in GNU Guix, and it's far preferable to download a source tarball than to have to clone a git repo, so we download the tarball from Debian. We clone the git repo when there's no other choice, but it's far from ideal.
I created just such a utility several years ago. It's called rlimit and is basically a command line interface to the standard getrlimit() and setrlimit() unix calls. You can find it here http://freecode.com/projects/rlimit. I'd be happy to move the source to GitHub.
ulimit can read and set limits for the current shell; rlimit sets limits for a child process. Admittedly, you could do nearly the same thing by setting limits with ulimit, running the target command, and then resetting the limits to their former state, or by running a sub-shell, setting the limits there, and then running your command in that environment. For example:
(ulimit -d 1024; <command>)
Or you could do it in one normal looking command with rlimit.
rlimit -d 1m <command>
Plus rlimit can set things like real-time priority which ulimit cannot.
As I see no one has mentioned it, let me pipe in with one more text processing tool that is invaluable in our modern world - jq https://stedolan.github.io/jq/ the commandline JSON processor. Consumes JSON input and its power is somewhere between sed and awk.
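A quick taste of jq (the JSON document here is made-up sample data; assumes jq is installed):

```shell
# Print the names of admin users from a JSON document; -r emits raw
# strings instead of quoted JSON values.
echo '{"users":[{"name":"ann","admin":true},{"name":"bob","admin":false}]}' \
  | jq -r '.users[] | select(.admin) | .name'
# prints: ann
```

The filter language composes with | just like a shell pipeline, which is what makes it feel like it sits between sed and awk.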