Nice write-up. Haskell and GHC have definitely been enormously successful platforms for PL research in general, and research in parallelism in particular. However, one unfortunate side effect of all that research effort is that there are now at least half a dozen libraries for parallelism and concurrency (STM, Async, Control.Parallel.Strategies, etc), to the point that it is hard for a software engineer to grasp the tradeoffs presented by the various options. I feel like I succumb to decision paralysis when I attempt parallel programming in Haskell; from a purely pragmatic perspective, I much prefer the Go approach, where there is a single canonical concurrency mechanism which is baked right into the language and is extremely well-documented.
Been a while since I've done Haskell, but the essential breakdown as I understand it is this:
At the lowest level you have the Control.Concurrent primitives: forkIO and MVar. forkIO lets you spin up new Haskell threads (not OS threads), and MVars can be used to communicate between them. An MVar is either empty or full, and operations block when they can't complete immediately (writing to a full MVar, or reading from an empty one). MVars also have single-wakeup semantics: if multiple readers are blocked on an empty MVar, only one of them is woken when someone writes to it.
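A minimal sketch of those primitives, assuming GHC's base library: a worker thread computes a result and hands it back through an initially empty MVar.

```haskell
import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)

main :: IO ()
main = do
  box <- newEmptyMVar              -- starts empty
  _ <- forkIO $ do                 -- lightweight Haskell thread
    let result = sum [1 .. 1000 :: Int]
    putMVar box result             -- fills the MVar (would block if full)
  r <- takeMVar box                -- blocks until the worker writes
  print r                          -- prints 500500
```

takeMVar here doubles as both the communication channel and the join point, which is why MVar alone already covers a lot of simple fork/join patterns.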
STM has transactional semantics. You can write to multiple STM variables atomically in a transaction, and your transaction will restart if any of your input variables have been written to while your transaction was running. Useful if you need to maintain consistency between several variables at all times.
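A hedged sketch of that, using the stm package: a transfer touching two TVars either happens atomically or retries. The `transfer` function and the balances are invented for this example.

```haskell
import Control.Concurrent.STM

transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to n = do
  balance <- readTVar from
  check (balance >= n)           -- block/retry until there are enough funds
  writeTVar from (balance - n)
  modifyTVar' to (+ n)

main :: IO ()
main = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)   -- all-or-nothing; restarts on conflict
  (,) <$> readTVarIO a <*> readTVarIO b >>= print   -- prints (70,30)
```

No interleaving of other transactions can ever observe the money missing from both TVars, which is the "consistency between several variables at all times" point above.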
Async is a wrapper around forkIO and STM that provides battle-tested implementations of the functions you would otherwise write yourself to express common patterns (running actions concurrently, waiting on results, cancellation, and so on).
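For instance, the async package's `concurrently` runs two IO actions in parallel and waits for both, propagating exceptions and cancelling the sibling if one fails. A small sketch (the computations are placeholders):

```haskell
import Control.Concurrent.Async (concurrently)

main :: IO ()
main = do
  (x, y) <- concurrently
              (pure (sum     [1 .. 100 :: Int]))   -- stand-in workload 1
              (pure (product [1 .. 5   :: Int]))   -- stand-in workload 2
  print (x + y)   -- 5050 + 120 = 5170
```

Writing that correctly by hand with forkIO and MVars, including the exception handling, is exactly the boilerplate the library saves you.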
Strategies are about optimistically evaluating Haskell thunks in parallel. For example, if you map a function over a list, you now have a list of thunks, and you can use Strategies to evaluate those thunks in parallel. The key thing with Strategies is that removing them from your code (or using a non-threaded runtime) doesn't affect the semantics of your program; it just might make it slower due to the loss of parallel evaluation.
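A sketch of that with Control.Parallel.Strategies (the `expensive` function is a stand-in for real work); deleting the `` `using` `` line changes only the speed, never the result:

```haskell
import Control.Parallel.Strategies (parList, rdeepseq, using)

expensive :: Int -> Int
expensive n = sum [1 .. n]     -- placeholder for a real computation

main :: IO ()
main = do
  let xs = map expensive [10000, 20000, 30000]
             `using` parList rdeepseq   -- fully evaluate list elements in parallel
  print (sum xs)
```

Compile with `-threaded` and run with `+RTS -N` to actually get the parallelism; without that, the program still computes the same answer sequentially.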
The Par Monad isn't used much, but the key idea there is you build an explicit dependency graph of the values in your computation and the runtime tries to parallelize the nodes in the graph that it can. Similar to what make does if you run it with -j.
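A sketch of that explicit dependency graph with the monad-par package: each `spawn` forks a node, each `get` is an edge the final result depends on, and the runtime schedules independent nodes in parallel, much like make -j.

```haskell
import Control.Monad.Par (runPar, spawn, get)

main :: IO ()
main = print $ runPar $ do
  a <- spawn (return (sum     [1 .. 100 :: Int]))  -- two independent nodes
  b <- spawn (return (product [1 .. 5   :: Int]))
  x <- get a                                       -- the final node depends
  y <- get b                                       -- on both of them
  return (x + y)
```

Since `a` and `b` have no edge between them, the scheduler is free to run them on different cores.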
I don’t think this is an entirely fair comparison. STM provides essentially primitive operations; Async just builds a higher-level API on top of them for common patterns. Not unlike the relationship between errgroup and goroutines.
Strategies solve a different problem: they give you control over the evaluation order of pure values. STM is aimed more at traditional multithreading.
> I much prefer the Go approach, where there is a single canonical concurrency mechanism which is baked right into the language and is extremely well-documented.
Haskell doesn't have everything, but I'd prefer something other than channels for concurrency. Actually, I like the Rust ecosystem in that regard (though it's missing something like accelerate...)
Really nice paper, especially the bullet points near the end.
I applied for a job on Simon Marlow’s Haskell team at Facebook about three years ago. I really enjoyed talking with him on the phone, but it was obvious that I didn’t have sufficient Haskell skills.
There is a lot of good synergy between meeting FB’s platform requirements and research and development for the Haskell ecosystem.
Sadly, Multicore OCaml [1] is nowhere near [2] completion yet, nor is it moving fast. There are still a lot [3] of issues to solve and to sync with mainstream OCaml before it is ready for anything basic. I write mostly OCaml these days (along with Rust, C, Python, and a couple of less mainstream languages), but I often miss the ease of parallelization in Haskell, the code clarity, and so on.
Truthfully, I'm waiting for Multicore OCaml, and then I'll only use it, Rust, and Elixir.
OCaml's type system is in my eyes unbeatable. I'm still considering writing a transpiler from it to Rust and Elixir, but I'm grossly unqualified to write compilers. :(
Haskell, though, I couldn't grok. It seemed to introduce a lot of tension around very basic topics -- like strings -- and it lost me.
The big distinguishing difference is that OCaml doesn't have ad-hoc polymorphism or function overloading. You cannot have a function that operates on many different types, which is why there are two sets of arithmetic operators for integers and floats ("+" vs "+.", for example). In theory you can use modules for this, but it's awkward, and I don't think they're generally used this way. (OCaml is getting implicit modules, which may help the situation.)
In Haskell, typeclasses make this super easy. They let you define a family of functions/operators that together become what a Go or Java developer might call an interface. Code written against a particular typeclass will work with any implementation of that typeclass.
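To make the contrast concrete, here is a minimal sketch: a single Num instance lets the ordinary `+` work on a new 2-D vector type. (`V2` is a name invented for this example; the `*` is componentwise purely for illustration.)

```haskell
data V2 = V2 Double Double deriving (Eq, Show)

instance Num V2 where
  V2 a b + V2 c d = V2 (a + c) (b + d)   -- plain + now works on V2
  V2 a b * V2 c d = V2 (a * c) (b * d)   -- componentwise, for illustration
  negate (V2 a b) = V2 (negate a) (negate b)
  abs    (V2 a b) = V2 (abs a) (abs b)
  signum (V2 a b) = V2 (signum a) (signum b)
  fromInteger n   = V2 (fromInteger n) (fromInteger n)

main :: IO ()
main = print (V2 1 2 + V2 3 4)   -- prints V2 4.0 6.0
```

In OCaml you would instead export a `Vec.add` function (or open a module redefining `+`) because `+` is fixed to int.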
On the other hand, OCaml's module system is very powerful and has no counterpart in Haskell. It allows you to package a whole type implementation into a single module that has a public signature, and can be instantiated with generic type parameters.
I don’t know anything about OCaml so please forgive me if this question seems silly.
two sets of arithmetic operators for integers and floats
What if I want to create more numeric types? Say vectors in R^n, for example. Do I need to create more operators for vector addition and scalar multiplication? How do I make sure the user can’t add vectors of incompatible dimension?
Yes, you need to add new functions. You cannot redefine "+" to work on a custom vector type.
Similarly, you cannot have a "print" function that works for all sorts of types. OCaml has print_string, print_int, and so on.
OCaml does have parametric polymorphism ("generics"), so you can write functions with generic type parameters. And you can use modules as "functors" to package up types in a way somewhat similar to Haskell typeclasses (e.g. see [1]).
There are a couple of implementations of strings:
* String - this is the default; it is a list of characters, and it was a mistake.
* ByteString and LazyByteString - these represent arrays of 8-bit characters. Performant, but they do not handle encodings.
* Text and LazyText - this is what you want to use when you work with text, since it represents text as Unicode.
There is a mess in these types. Some libraries use different types, so you might have to convert between them, which can be confusing, and the conversion APIs are not unified. But if you manage to consistently use a single type in your project (say Text), then you'll actually have a great time working with strings. It's a problem if you want to quickly mash together a few libraries that don't use the same types, since it gets messy.
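A sketch of the conversions in question, assuming the text and bytestring packages: pick Text internally and convert at the edges.

```haskell
import qualified Data.Text          as T
import qualified Data.Text.Encoding as TE
import qualified Data.ByteString    as BS

main :: IO ()
main = do
  let s  = "héllo" :: String    -- String is just [Char]
      t  = T.pack s             -- String -> Text
      bs = TE.encodeUtf8 t      -- Text -> ByteString (explicit UTF-8 encoding)
  putStrLn (T.unpack t)         -- Text -> String, back again
  print (BS.length bs)          -- 6: 'é' is two bytes in UTF-8
```

Note that each hop uses a differently named function from a different module, which is the "not unified" complaint above.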
> ByteString and LazyByteString - this represents array of 8bit characters
Not exactly. ByteString has no relation at all with text or characters. It represents a sequence of bytes, AKA binary data.
I imagined I would hear complaints about too many text types when I asked that question, but it sounded like the GP had a different complaint.
The library mess is real, but it's improving quickly, and it's a bad reason to decide Haskell is a bad language for dealing with text. That's throwing out a lot of great stuff just because of a minor inconvenience. And by the way, one of the best things about Haskell is that you can define a lot more text types. For example, my current project has types for internal identifiers, HTML, HTML form values, JS, and SQL; they are all text, but I can always be sure things are properly encoded.
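One common way to get that guarantee is a newtype wrapper per kind of text, so an HTML fragment and an SQL fragment can't be mixed up by accident. A minimal sketch (`Html`, `Sql`, and `escapeHtml` are names invented for this example, not from the comment's actual project):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Text (Text)
import qualified Data.Text as T

newtype Html = Html Text deriving Show   -- zero runtime cost over Text
newtype Sql  = Sql  Text deriving Show

-- If this is the only exported way to build an Html value,
-- every Html in the program is escaped by construction.
escapeHtml :: Text -> Html
escapeHtml = Html . T.replace "<" "&lt;" . T.replace "&" "&amp;"

main :: IO ()
main = print (escapeHtml "a < b & c")   -- prints Html "a &lt; b &amp; c"
```

The type checker then rejects passing an `Sql` where an `Html` is expected, even though both are "just text" underneath.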
"there’s a lot more to realising good parallel speedup than just choosing the right language"
from ye olde days - if you can't split the computation into nice isolated chunks the point is moot, so relatively fine granularity of your algorithm's steps is the key here; the lack (or strict control) of side effects means you don't get to make too many mistakes in the mechanics of running the computation.
> so relatively fine granularity of your algo stepping is the key here
Fine granularity requires pretty much zero-overhead synchronization, which green threads, or any shared-memory multithreading implementation, can't provide; a task needs to do a few thousand nanoseconds of useful work before the synchronization costs become even bearable.
> Fine granularity requires pretty much zero overhead synchronization
Yes, like a work-stealing scheduler. Many tiny tasks are fine as long as you keep them on one core. Other cores can steal a batch of them from the other end of your queue with minimal synchronisation every now and again.
That's not my experience. I think work stealing is low overhead as it amortises synchronisation, only does it at all when actively required to get more work, and reduces conflicts due to two ends of the queue. Why do you think it's high overhead?
Isn't work stealing excellent for locality? Jobs stay on the same core until there's a need to steal, and then the most likely non-resident jobs are taken. Is it even cache oblivious?