
> There is a reason why "wc" is not multithreaded: it just can't. It must work sequentially, because in the general case the input of "wc" cannot be skipped over.

Eh what? Word counting is embarrassingly parallel, at least for files.

You start up k threads, each working on its own chunk of the file, assuming there is a word boundary at the start of its chunk. Then you sum up the per-chunk word counts, inspect whether the chunk boundaries actually were word boundaries, and subtract 1 each time that assumption was violated.
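A minimal sketch of that scheme (Python rather than wc's actual C, and computed sequentially here — each per-chunk count is the work one of the k threads would do independently):

```python
def chunked_word_count(text, k):
    """Count words by splitting text into k chunks, then fixing up boundaries."""
    n = len(text)
    bounds = [i * n // k for i in range(k + 1)]
    # each of these per-chunk counts could be computed by its own thread
    total = sum(len(text[bounds[i]:bounds[i + 1]].split())
                for i in range(k))
    # a chunk boundary that landed inside a word counted that word twice
    for b in bounds[1:-1]:
        if 0 < b < n and not text[b - 1].isspace() and not text[b].isspace():
            total -= 1
    return total
```

The correction works because the only way the per-chunk sum can disagree with the true count is when a boundary splits a word, and each such split inflates the sum by exactly 1.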



> Eh what? Word counting is embarrassingly parallel, at least for files.

wc does not presume that it is working on a file that can be seeked into (and, more importantly, rewound). in c++ terms, it behaves as though the source it is reading from is an "input iterator"[1].

if you are piping into wc, then you'll never be able to parallelise it. word counting a file is the edge case here, not the general use.

[1] - https://en.cppreference.com/w/cpp/named_req/InputIterator


You don't need to seek: spin up workers, have them wait on a mutex, and read blocks. If the producer is slow, then you won't need to feed multiple workers. If it's fast, it's a matter of finding the tradeoff between block size and per-block processing time.


i expect you're being downvoted because it's likely that the cost of deciding which thread to assign work to is an order of magnitude more expensive than accumulating the required counters.

in any case: you're right, it is of course possible. what i should have said was:

> if you are piping into wc, then you'll never see a performance gain from any attempts at parallelisation.


Well, I'm back at 1 point now. In any case, the amount of work needed to assign a block to a thread is trivial. For a small number of threads, it requires at most a couple of mutex operations and a couple of loads/stores to distribute each block. Since the block size can be set arbitrarily high, the work-assignment overhead per byte can be made arbitrarily low. The biggest issue is that you won't know the input size, and below some cutoff size multiple threads will only add cost. But for small inputs, performance won't be an issue anyway.

EDIT: Actually, the simplest way of assigning work is probably just to have each thread try to acquire a mutex, call read(), and then release the mutex and do their work.
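That scheme can be sketched roughly as follows (Python threads rather than wc's C, with a hypothetical function name; each worker takes the lock, reads one block and a sequence number, releases the lock, and counts concurrently — words split across block boundaries are patched up at the end, in the same spirit as the chunk-boundary fix upthread):

```python
import threading

def wc_words_parallel(stream, nthreads=4, blocksize=1 << 16):
    lock = threading.Lock()
    results = {}     # block index -> (words, starts mid-word?, ends mid-word?)
    next_idx = [0]

    def worker():
        while True:
            with lock:  # serialise read() and block-index assignment
                block = stream.read(blocksize)
                if block:
                    idx = next_idx[0]
                    next_idx[0] += 1
            if not block:
                return
            # counting happens outside the lock, concurrently with other workers
            results[idx] = (len(block.split()),
                            not block[:1].isspace(),
                            not block[-1:].isspace())

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    total = sum(words for words, _, _ in results.values())
    # a word straddling two consecutive blocks was counted once per block
    for i in range(len(results) - 1):
        if results[i][2] and results[i + 1][1]:
            total -= 1
    return total
```

Note that only the read() and the index bump sit inside the critical section, so the per-block assignment cost stays constant no matter how large the blocks are.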


> Word counting is embarrassingly parallel, at least for files.

True, but "wc" does not work only on files: it also works on character devices and on streams of data that cannot be randomly accessed.


I think your parent meant that you can't do that if you get your data from a pipe. You can't `lseek` on a pipe.


The sentence before your quote explains why that's not always an option.

(Of course wc could switch to a multithreaded algorithm while working on files, but the point did make sense.)


Pipes.

I've used Unix every day for 20 years, and I'd say 90% of wc use cases read from a pipe.



