While I really like these blog posts for keeping things light and intuitive, they're also sprinkled with claims that go a bit too far.
> Arrays, Objects, Maps, WeakMaps, and Sets are all algebraic data types
This isn't really true on a couple of levels. Algebraic Data Types are described through a small set of composition (data) or decomposition (codata) operations. Arrays, Maps, and Objects might be explainable in this way (though, really, we'd just be modeling them with ADTs). WeakMaps and Sets really cross the line, though. WeakMaps, by their nature, need to appeal to something much more than just ADTs, and Sets require "quotienting" to identify sets that contain the same elements but were built up in different ways. You can build the "raw stuff" from ADTs, but you won't have something that really behaves much like these types at all until you add in something extra.
The emphasis on sum types is huge, but it leaves out something critical: "JavaScript doesn’t have a lot of built-in features for sum types" is true of user-defined sum types (and related to it just not having types altogether), but it has great support for a few special sum types: integers, booleans, characters! These sum types are important to talk about because they help make clear that we already work with sum types all the time! Our basis for distinction and choice is built atop them in any language.
Finally, while products and sums get you some wonderful modeling capabilities, the place where Algebraic Data Types really shine is when you add two further capabilities: function types (exponentials) and recursion (fixpoint, which doesn't really map to high school algebra). Those are clearly built into JavaScript and would be great to discuss as well.
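Aside: the "exponential" name still fits the counting story, since a function type a -> b has |b| ^ |a| inhabitants. For example, Bool -> Bool has 2 ^ 2 = 4 of them, which a quick (admittedly contrived) Haskell listing makes concrete:

-- All four inhabitants of Bool -> Bool, up to extensional equality.
notF, idF, constTrue, constFalse :: Bool -> Bool
notF       = not
idF        = id
constTrue  = const True
constFalse = const False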
I really, really hope JavaScript eventually gets support for:
- User defined Sum types
- Pattern matching on these Sum types in switch statements
- Blocks and things-that-are-currently-statements becoming expressions that evaluate to a value.
If it got those, then it would be pretty close to the perfect dynamic language for me.
What is the benefit of pattern matching in a language where any value can be of any type at runtime? You can just pattern match by saying if (obj.type === 'foo') {}; if (obj.type === 'bar') {}; etc in JS, as long as you have set up the convention that you set the 'type' field, and practically this is just as good.
The real benefit of pattern matching is the compiler checking at least that you have valid cases, and ideally that you haven't missed any cases.
Pattern matching goes significantly beyond checking “foo.type”. It adds the ability to specify exactly (as exactly as the language and your data allow) what case you’re dealing with. For example, a list of length three where the first and last elements are empty strings. Having to specify this with a series of if statements often leads to verbose and error-prone code, but it’s trivial in a language like ReasonML:
switch (mylist) {
| ["", _, ""] => x
| _ => y
}
Moreover, pattern matching usually involves pulling information out of the object at the same time. In the above example, maybe you want to do something with the middle element.
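Roughly, in Haskell-ish syntax (most examples in this thread are Haskell anyway; `describe` is just a made-up name), binding the middle element looks like:

-- Match on the shape and bind the middle element in one go.
describe :: [String] -> String
describe ["", middle, ""] = "middle element: " ++ middle
describe _                = "something else"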
While having a type system is helpful in all of the ways a type system is generally helpful, there’s nothing about this code which wouldn’t also be handy in a dynamic language like JavaScript. For further evidence of this see Erlang and Elixir, dynamic languages which make heavy use of pattern matching.
(Code typed on my iPhone so forgive the dopey examples)
Nevertheless, those special cases are the exception, not the rule. Most of the time the pattern matching really is just a more convenient switch statement.
I'm not sure what your experience is, but this seems to differ from mine (lots of recent Haskell experience, some dated OCaml).
I agree that matching on multiple levels in one statement isn't common, if that's what you were talking about. But performing bindings in a match is very common, and not something that's part of a switch statement (except in those languages where a switch statement is doing pattern matching more generally anyway).
Yes, I was just trying to say that matching on multiple levels is rarer. I agree with you that automatically binding new variables is the best part of pattern matching.
My examples were deliberately dumb, but I’ve certainly used this ability to pattern match at arbitrary depth innumerable times. It most commonly appears when looking at a tuple of N values, and I want to enumerate specific pairings of values.
Nevertheless, if that was the only benefit of it, one might argue it’d be more of a party trick. But it’s simply one more facet of a very powerful, simple-to-use technique for expressing how your code reacts to the shape of your data.
I’m curious how much exposure you’ve had to languages which support it — you might find it more useful than you think.
I find it often winds up clearer to pull things into multiple case statements, especially as the code grows more complicated. That said, there may be a difference in what we're meaning by "most of the time," so let's be more precise. IME, most Haskell programmers will encounter use cases for multi-level matches (beyond simply pulling the first couple of elements off a list) a small number of times per week (maybe month, depending on the programmer), whereas they're typically writing a case statement some number of times per hour.
I wouldn't call it a party trick, but I also wouldn't say it's the most common use of case by a long shot. I don't expect that you disagree.
I find that Haskell (both mine and that of others) involves a lot less pattern matching than ReasonML, since control flow is often expressed by type classes. Unfortunately, these are lacking in ReasonML, and I think that might contribute to why you tend to see more explicit pattern matching there. But maybe that’s an incorrect impression. In either case, it’s not common all the time, but neither is it particularly rare.
myList match {
  case Nil => 0
  case first :: Nil => first * 3
  case first :: second :: Nil => first - second
  case first :: second :: rest => first * second + sum(rest)
}
In what sense are integers and booleans sum types? What is the definition of sum types that booleans fit? Is it just that it can be true or false, and because there's the "or", it's a sum type?
There’s nothing wrong with an infinite sum, in principle. You can’t usually define them unless you can use recursion. In the case of integers you can do
data Int = Positive Nat | Negative Nat
data Nat = Zero | Succ Nat
You can't. The Natural number definition combines a sum type with a recursive position. Or in other words, a finite tag (either zero or successors, 2 options) plus a pointer to the next value in the case of successor.
> Is it just that it can be true or false, and because there's the "or", it's a sum type?
You can rote learn that, but the reason why it's intuitive to call them sum types is because you're adding the number of possible values. With product types, you multiply the number of possible values that the type represents. Consider these types:
-- 1 + 1 = 2 possible values
data Bool = True | False
-- 1 + 1 * a = 1 + a possible values
data Maybe a = Nothing | Just a
-- 1 * a + 1 * b = a + b possible values
data Either a b = Left a | Right b
-- 1 * a * b possible values
data Pair a b = Pair a b
-- 1 * a * b * c possible values
data Triple a b c = Triple a b c
`Maybe Bool` would have 1 + 2 = 3 possible values.
`Either (Pair Bool Bool) (Maybe Bool)` would have (2 * 2) + (1 + 2) = 7 possible values.
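To make that last count concrete, the seven values can be written out (repeating the Pair definition from above):

data Pair a b = Pair a b

-- All inhabitants of Either (Pair Bool Bool) (Maybe Bool): 2*2 + (1+2) = 7.
values :: [Either (Pair Bool Bool) (Maybe Bool)]
values =
  [ Left (Pair True  True)
  , Left (Pair True  False)
  , Left (Pair False True)
  , Left (Pair False False)
  , Right Nothing
  , Right (Just True)
  , Right (Just False)
  ]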
Integers would be defined as:
data Int = ... | -2 | -1 | 0 | 1 | 2 | ...
if they were finite, but since they're not, they'd have to be defined recursively with something like:
data PositiveInteger = One | Succ PositiveInteger
data Integer = Zero | NonZero Bool PositiveInteger
where the Bool represents the sign.
Since 2 * x is the same as x + x, we can also define it as:
data Integer = Zero | NonZero (Either PositiveInteger PositiveInteger)
where Left of Either would represent negative integers, and Right of Either would represent positive integers.
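A small sketch of how that encoding lines up with ordinary integers (the names toBuiltin and positive are mine, purely for illustration, and the type is renamed MyInteger to avoid clashing with the built-in Integer):

data PositiveInteger = One | Succ PositiveInteger
data MyInteger = Zero | NonZero (Either PositiveInteger PositiveInteger)

positive :: PositiveInteger -> Integer
positive One      = 1
positive (Succ p) = 1 + positive p

-- Left is negative, Right is positive, exactly as described above.
toBuiltin :: MyInteger -> Integer
toBuiltin Zero                = 0
toBuiltin (NonZero (Left p))  = negate (positive p)
toBuiltin (NonZero (Right p)) = positive p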
Summing is the act of gluing two components together with “or”. It’s possible to see all types as sums in this fashion (this is essentially "distributivity of multiplication over addition"), but it’s also impossible to see char/int/bool as not having some sense of “or” within them.
For a product type with no “or”, consider the JavaScript object {}. We see these as “atomic”.
A boolean is either true or false - i.e., it's the type inhabited only by "True" plus the type inhabited only by "False". An integer is either 1, or 2, or 3, or 4, or..., i.e., it's the countably infinite sum of other types.
First, here is a note about the notation that I will use here: Let `1` denote the unit type, the type with one inhabitant. Let `()` denote the inhabitant of the unit type. Let `a + b` or `Either a b` denote the sum of types `a` and `b`. Let `Left` and `Right` be the constructors of the sum type.
Before defining the integers, one must define the natural numbers. Natural numbers are defined as an inductive type, which is a little different from an ordinary sum type:
data Nat = Z | Suc Nat
Here is another way of defining them. Start by removing the self-reference:
data NatF a = Z | Suc a
Notice that NatF is the same thing as the Maybe/Option type, AKA 1+a, AKA the sum of the unit type and a.
Given a "fix" type:
data Fix f = MkFix (f (Fix f))
the definition of the natural numbers would be:
type Nat = Fix NatF
An "F-algebra" is a way of defining some algebraic structure or data type as a map from a "sum of products" to the type. More precisely, an F-algebra consists of an "object" (type) A and a map F(A) -> A. For the natural numbers, the F is NatF and the map would be a function mkNat : NatF Nat -> Nat, AKA mkNat : (Maybe Nat) -> Nat or mkNat : (1 + Nat) -> Nat.
You can't have negative numbers with that definition. Here are numbers and the way you would write them with that definition:
0 == Z
1 == Suc Z
2 == Suc (Suc Z)
3 == Suc (Suc (Suc Z))
You can define some arithmetic operators for it:
data Nat = Z | Suc Nat deriving (Show, Eq)
instance Num Nat where
  (Suc x) + y = Suc (x + y)
  Z + y = y
  (Suc x) - (Suc y) = x - y
  x - Z = x
  Z - (Suc _) = undefined
  Z * y = Z
  (Suc x) * y = (x * y) + y
  negate = undefined
  abs x = x
  signum (Suc _) = Suc Z
  signum Z = Z
  fromInteger 0 = Z
  fromInteger x
    | x < 0 = undefined
    | otherwise = Suc (fromInteger (x - 1))
The `fromInteger` definition is used by GHC to implicitly convert integer literals in the code to our Nat. We can use it like this:
ghci> 0 == Z
True
ghci> 1 == Suc Z
True
ghci> 2 == Suc (Suc Z)
True
ghci> 3 :: Nat
Suc (Suc (Suc Z))
ghci> 3 + 2 :: Nat
Suc (Suc (Suc (Suc (Suc Z))))
ghci> 3 - 2 :: Nat
Suc Z
ghci> 3 * 2 :: Nat
Suc (Suc (Suc (Suc (Suc (Suc Z)))))
ghci> 3 - 4 :: Nat
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:78:14 in base:GHC.Err
undefined, called at <interactive>:70:15 in interactive:Ghci8
Sure, in the same sense that every set is a disjoint union over singletons of all its values. But whether you would or could naturally specify it that way or not depends on the set. You likely wouldn't sit down and specify the type "Function from natural numbers to Boolean" as a big OR over each possible such function, for example.
But yes, whether something is a sum type or a product type or whatever is not generally so interesting as a predicate about the type in itself. It's a relationship between this type and other types. The same way it is not interesting to say "7 is a sum" (everything is a sum); what's interesting is to observe "7 is the sum of 3 and 4".
[Disclaimer: In the same way that some topological spaces are connected and cannot be decomposed into separate disconnected components, one might sometimes reason that some types are connected and not reasonably represented as disjoint unions of subtypes. So "Everything is a sum" is not always correct in all contexts. But let's ignore this sort of pedantry for now.]
Here's what's bugging me: Integers are a disjoint union of values, which is (in my non-abstract-algebra mind) something different than a disjoint union of types. That is, something like Maybe Integer is a "sum type" in a completely different way from the way that Integer is a "sum type". To me, this is different enough that a different term should be used.
I'm sure you're going to tell me that I'm not thinking abstractly enough...
There are sums at the type and value level. If I define Color = Red | Blue | Green, that’s not a use of a type which sums types together, but is instead just a type composed of a sum of values.
So a “sum type” can be any type which is the sum of different values. On the other hand, Either is a canonical “sum type (constructor)” which sums two types together (via the summation of their values).
If it weren't for finiteness restrictions, you could write the following in Haskell and it'd make a natural number type that acted basically as expected.
data Zero = Zero
data One = One
data Two = Two
...
data NaturalNumber = Zero | One | Two | Three | ...
Whatever you would say about sets, you might as well say the same thing about types; the distinction in terminology doesn't amount to much. The natural numbers are the disjoint union of {0}, {1}, {2}, {3}, ... . They're also the disjoint union of the even naturals and the odd naturals, or the disjoint union of {0}, {1}, primes, and composites. There are a million ways in which they are a disjoint union.
I don't think it's your level of abstraction, I think it's that you are treating as unrelated two sides of the same coin. This sum type / variant / disjoint union / coproduct construction operates both at the level of types and the level of values. After all, we only care about types insofar as they tell us about values.
So Integer (0 or 1 or 2 or...) is a sum type in the same sense that Maybe Integer (Nothing or 0 or 1 or 2 or...) is. I see. That makes some sense.
So if I'm in Haskell, say, and I've got a Maybe Integer, and I'm trying to walk through the options (pattern matching), I usually see the options as Nothing or Just Integer. Can I instead do Nothing or Just 0 or... "Just some other Integer"? That is, can I treat... um, not sure how to say this in light of your post... "individual values of a particular named type" the same way as I can treat "different named types"?
Ah I see, so what you're talking about is the syntax for defining a sum type. This confused me too. In Haskell, a definition of a sum type looks something like:
data MaybeInteger = Nothing | Just Integer
This defines the type MaybeInteger by saying how to construct the values that inhabit it. `Nothing` is a value that is immediately of type MaybeInteger. `Just` is a way to construct a value of type MaybeInteger by feeding it an Integer. You use `Just` to construct a value by applying it to an Integer: `Just 4`, for example.
There is an alternative syntax for defining sum types that might be clearer:
data MaybeInteger where
  Nothing :: MaybeInteger
  Just :: Integer -> MaybeInteger
When using sum types, we use a case statement:
case x of
  Nothing -> "nothing"
  Just 1 -> "one"
  Just _ -> "something else"
data MaybeInteger where
  Nothing :: MaybeInteger
  Just :: Integer -> MaybeInteger
Doesn't this mean that Nothing (i.e. MaybeInteger) is the same "type"/"supertype", however you call it, as a constructor from Integer to MaybeInteger?
How does that work? E.g. how is MaybeInteger equivalent to Integer -> MaybeInteger? I guess ending up at the same type is crucial, but one looks to be the type directly, while the other is a constructor for the type.
And how can "Just" be a constructor for MaybeInteger here, but e.g. MaybeString in some other example? Is it specially blessed? What kind of thing is it in the language?
Nothing and Just are constructors. Constructors need not be the same type. The whole point of a sum type is that you can construct it in multiple ways, after all.
In this case Nothing is immediately a value of type MaybeInteger. Just on the other hand has the type of a function :: Integer -> MaybeInteger.
So they don't have the same type, but Nothing and Just 5 have the same type, MaybeInteger.
For pedagogical purposes I pretended to work with MaybeInteger instead of Maybe from Haskell. The definition of Maybe is:
data Maybe a = Nothing | Just a
or alternatively
data Maybe a where
  Nothing :: Maybe a
  Just :: a -> Maybe a
I didn't want to simultaneously explain polymorphism and sum types, but the way you can combine them is vastly more useful than either feature alone.
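A tiny example of that combination: one polymorphic Maybe plus pattern matching gives a single safeHead that works for lists of any element type.

-- Nothing signals an empty list; Just carries the head, whatever its type.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x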
It's rarely useful (though there are some interesting examples), but it certainly is possible to represent the naturals as Church encodings [1] or Peano naturals [2] entirely in the Haskell type system. You'll see some references to such things elsewhere in these threads.
You can have a "type" for 4 in Peano naturals and use that as a type guard on a function that can only physically take the number 4, and the Haskell type checking system will enforce that.
It's rare that you need that granular of a type, though, and as pointed out in sibling response it's easy enough to do with runtime pattern matching of Integer values. But it does illustrate that type abstraction "can go deeper".
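A rough sketch of what that looks like in Haskell, using the DataKinds and GADTs extensions (the names here are illustrative, not from a library):

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Peano naturals, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- A singleton tying each type-level Nat to exactly one runtime value.
data SNat (n :: Nat) where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

type Four = 'S ('S ('S ('S 'Z)))

-- Can only ever be handed the value representing 4; anything else
-- (e.g. SS (SS SZ)) is rejected at compile time.
onlyFour :: SNat Four -> String
onlyFour _ = "got exactly four"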
Indeed, maps and sets cannot be constructed from sums, products and recursion. They can be approximated if you throw exponentials in there.
Instead, you can extend algebraic data types to combinatorial species, which do allow exactly the kind of quotienting you speak of. While they're unexplored in programming, they are believed to be programmable similarly to ADTs.
Don’t mind my nitpicking. A huge part of algebraic data types arises from products and sums just as discussed here. As is always the case, there is hairiness between metaphor and precision, but presenting these ideas in JavaScript like this goes a long way!
Please don't do the "I'll never understand this" thing! Sure, it's possible that you never will, but mostly it's just about trying and practicing... AFAIUI the current scientifically supported thinking is that a good way to learn is to approach the subject from many different angles.
Something to do with encouraging your brain to form lots of connections from different concepts to the concept you're trying to learn...
I never thought I would understand point set topology, but for a few years I really did. (That was quite a few years ago, and since I haven't practiced... Well, it atrophied. Still remember my favorite proof, though! I also remember what a nightmare proof Stone–Weierstrass was :) ... amazing, but still a nightmare.)
I actually have a decent practical understanding of sum types as they are used in Rust at least (including the tagged union implementation of enum). It's good enough to do useful work.
Every time that functional programming patterns get discussed, though, there is a mad rush by functional programmers to impress upon everyone just how complicated they are.
It is the same impulse that keeps the mathematical entries in Wikipedia accurate but utterly worthless for most people.
Absolutely, a practical understanding is enough. It also suffices for monads, monad transformers, etc.
> Every time that functional programming patterns get discussed, though, there is a mad rush by functional programmers to impress upon everyone just how complicated they are.
I... haven't observed this behavior personally. FWIW, I don't think that saying (for example)
> Hey, we can actually define what "|" and "(_,_)" means in a mathematical way
detracts in any way from a practical understanding. Can you elaborate on why you think it's detrimental? (I understand that it can come off as a little bit smug, but beyond that... meh?)
Asserting that an article is "sprinkled with claims that go a bit too far" calls into question what the reader may have learned. It does not build upon the article, like a popularizer would, but first tears it down.
Then, to restore the reader's sense of mastery over the material, it's proposed that they grok a significantly more esoteric perspective.
I'm just trying to understand... is your beef here with the particular choice of verbiage?
(FWIW, I think it's absolutely right to call into question what the reader may have learned if an article is misleading. We can argue about wording, etc., but accuracy is important, I feel. Now, Lies-to-Children[0] certainly have their place, but they must be fundamentally accurate even if simplified.)
[0] I think I first heard of this concept in the Science of Discworld books...?
When I found sum types (in Haskell) I could not believe they had not been part of any prior language I'd learned. You can kind of fake them with inheritance but it's awkward, and you don't get totality checking, and you can't close the set of alternatives, so a new descendent added later can break things.
> I could not believe they had not been part of any prior language I'd learned.
Huh? They're part of Pascal (known as variant records) and plenty of other languages besides. It's mostly C that lacks them, and even then you're just expected to implement them yourself, for maximum efficiency.
> Huh? They're part of Pascal (known as variant records) and plenty of other languages besides. It's mostly C that lacks them, and even then you're just expected to implement them yourself, for maximum efficiency.
"plenty of other languages besides" doesn't matter if those languages aren't being used in practice.
You can argue about the demographics of Stack Overflow or the bias in users who actually respond on these surveys or a million other things. But based on job postings I've seen in the last several years, I'd wager this list is pretty accurate.
Yeah. You're not wrong. Sadly, while HN is full of people of great technical expertise, it's also full of people whose knee-jerk reaction to reading _anything_ here is to think of how they can say it's wrong, preferably in a way that implies or explicitly states that they're bigger experts than the person they're replying to. I find it very off-putting, personally, but I keep wading through the crap because (thankfully) that's not the whole story.
I did follow the link, but I disagree with your conclusion that being used by at least 6.6% of respondents to make apps for hundreds of millions of users amounts to "not being used in practice".
If you take my comment in isolation, that's a reasonable read.
But it was a response to the following comment:
> > I could not believe they had not been part of any prior language I'd learned.
> Huh? They're part of Pascal (known as variant records) and plenty of other languages besides. It's mostly C that lacks them, and even then you're just expected to implement them yourself, for maximum efficiency.
With the context of that post, I hope you would understand that I am disagreeing with the insinuation that every programmer has used "Pascal [...] and plenty of other languages besides". You can easily learn five different languages and finally end up in Haskell without having encountered ADTs before.
They're not present in the dominant statically typed languages (C/C++/C#/Java/Go) and not really a thing in dynamic languages generally, and so the vast majority of programmers never encounter the concept.
Well, Pascal was dominant for a short time before C and its family really took off. But that was such a long time ago that I kind of forgot about it. C (without ++) was nice because you could write code without too much boilerplate compared to Pascal, and it was noticeably faster (computers were slow). However, any mistake with pointers could crash the computer if it was a DOS PC and not a UNIX workstation.
My understanding is that's not really a sum type, though -- e.g., you couldn't use it to add a type to itself, only to add distinct types to each other.
No, you can add a type to itself (that is a pain in the ass though). You might be thinking of the fact that it may not hold a value (eg. if an exception is thrown while moving into it), but they made it as close to a sum as they could given the nature of C++.
It's been a long time since I did anything with Pascal, but if I recall variant records correctly, they're not very different from C's unions. In particular, I don't recall variant records having anything like totality checking. I don't even recall it having anything that could tell you what kind of thing it currently holds. But as I said, it's been a long time...
> They're part of Pascal (known as variant records) and plenty of other languages besides.
It sounds like GP is saying they had never used any of those languages beforehand. I'm not sure why that would be so surprising, given that it doesn't come with any other information like "and I had used 20 other languages before it".
Even Haskell doesn't get it right: there are no exhaustiveness checks on pattern matches by default (GHC only warns if you enable the warning, something OCaml does out of the box), and while sum types provide closed unions, getting open unions is clunky (you have to use typeclasses). OOP languages are imperfect in the opposite direction: open unions out of the box (inheritance) but no closed unions and no pattern matching. Seems language authors on the whole are bad at thinking design spaces through. "The user might want any of three simple things X, Y, or Z, so let's choose the one that our language will support well, the one it will kind of support, and the one it won't support whatsoever just for the hell of it".
We say a sum type is "open" if more alternatives can be added later (or some alternatives can be removed later), perhaps in a different translation unit or a different library. We say it is "closed" if all alternatives must be specified at the definition, no alteration allowed.
For example if you write this in Haskell
data PrimaryColor = Red | Green | Blue
colorToInt Red = 1
colorToInt Green = 2
colorToInt Blue = 3
it is closed because future code can't add more alternatives. To get an open version, you must instead simulate it using type classes, with a wildly different syntax, and (some would say) a hack. Or define a different sum type that wraps the above.
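For concreteness, the type-class simulation looks roughly like this (one possible encoding, not the only one):

-- Each alternative is its own type; the class plays the role of the union.
data Red   = Red
data Green = Green
data Blue  = Blue

class Color a where
  colorToInt :: a -> Int

instance Color Red   where colorToInt _ = 1
instance Color Green where colorToInt _ = 2
instance Color Blue  where colorToInt _ = 3

-- A later module can add an alternative without touching the code above:
data Purple = Purple
instance Color Purple where colorToInt _ = 4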
OCaml on the other hand supports the closed definition above perfectly with a slightly different syntax, but it also supports polymorphic variants (not to be confused with a Haskell sum type that's polymorphic because of a type variable like Maybe), which are lighter weight. For example you can write a function that takes different "constructors" without necessarily giving a definition for the type or listing all alternatives:
let color_to_int = function
| `Red -> 1 | `Green -> 2 | `Blue -> 3
Notice I never defined any type for the three colors! If you wish, you can mention the type:
[< `Red | `Green | `Blue ]
or give it an alias to make it look more normal. But you can add more things to this variant (with some limitations), or remove things from it, later on.
To me, though, the most intuitive way of understanding this is to think of the first as nominal typing, whereas the second is structural typing.
All of these things build up some form of computation that doesn’t execute until later. Until then, they’re represented by some kind of data structure.
I feel lucky that I learned Standard ML among the early languages I learned as a teenager, which got me into this mindset. That was way back before the monadic revolution.
A C union is a sum, but not much of a type, since it's never checked (not even at runtime like Python) and using it wrong in your code will silently corrupt your program.
Where does the author insert themselves into their imaginary social interactions? Are they the sneering functional programmer?
In particular, I wonder about their characterization of sum types. I had to search to make sure, but they never say the word "union". This is curious, because they do understand "tag" and "tagged" as words used to describe sum-type behavior, and because the classic way to explain sum types to "imperative" C programmers is as "tagged unions", or union types with a tag value that explains which of the union's constructors is present. C or Java programmers would instead prefer an analogy with subtyping, which the article proceeds to use, but by overlooking the tagged-union analogy, the author is actually missing out on both a useful analogy for reaching out to imperative programmers, and also how low-level implementations of functional programming languages usually implement sum types.
Pattern-matching is probably the killer part of ADTs, but only two paragraphs were spent upon the concept. Worse, it's implied that pattern-matching is tied deeply both to ADTs and to functional programming, when it occurs outside of those contexts. Languages like Python 3, Ruby, Racket, and Swift fall outside the traditional "functional programming" lineage but still have interesting pattern-matching abilities.
The author still doesn't know Haskell. Want an effectful loop? Control.Monad.Loops [0] contains many prebuilt loops, and they're written in standard Haskell. The ST monad [1] provides imperative variables and mutation. So does IO [2], if one insists on avoiding GHC for some reason. I agree that it's good to learn an ML, and Haskell is a fine choice, but the author needs to realize how drenched in Haskell memes they currently are, and to either learn more of Haskell itself, or to take a break from it for a while.
My thesis is that self-proclaimed functional programmers spend more time on mathematical structures than on implementation details. Hence "sum type" and not "tagged union" (how the target evaluator distinguishes the cases should not name the type).
Software developers are allowed to have discussions about "integer" types without being constantly interrupted by number theorists to remind them that those aren't really integers.
Perhaps one day we will be able to have similar discussions about sum types without constant interjections by functional programmers.
A product type is just a closed group of values where you know in advance what will be grouped and what type each value in the group is going to have.
A class in Java for example is a product type. As it describes a group of fields and their types. Same for a struct in C++.
Where I disagree with the article is that JavaScript does not have product types, because you do not have a definition of such a group of values in advance. A JS object does not list what values it groups and what type they each will have. A JS class comes closer in listing what values it will group, but not their types. Also, product types must be closed, and a JS class defines an open group; at runtime the object could group more than what it specified, unless explicitly frozen. You could say a frozen class defines a product of types where all types are the Object type and thus can be of any value, but that's a stretch, because product types are only as useful as they define a constrained product of possible values for the type.
A sum type is just a known list of possible types a value can take. That's the one people aren't as familiar with, as most popular languages don't have a construct like it. It lets you say: this variable can be a String or an Int, but not both.
Often people say product type is AND while sum type is XOR. This variable contains a String AND an Int. They will both be there. So it contains both. That's a product type. This variable can contain a String XOR an Int, they can't be both contained, it is only one or the other. That's a sum type.
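In Haskell notation, just to pin the two shapes down (the type names are made up):

-- Product: every value carries a String AND an Int.
data Both = Both String Int

-- Sum: every value carries a String XOR an Int, never both at once.
data OneOf = AString String | AnInt Int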
You can think in Java that all types are a sum type: they are either null or of their defined type. But null isn't a real type; it's more that it is an allowable value of all types. So it's a bit of a stretch.
Like the article said, the best way to mimic them in popular languages is by extending an abstract class. Each type the variable can be will be a child of the parent. Thus a variable of type parent can be one and only one of its children. It's not full featured: you can't make existing types extend from the parent, so you can't arbitrarily define sum types using any available type. You also often can't list all the possible types of parent; depending on the reflection capabilities, knowing everything that extends a type isn't always feasible. Also, the static type checkers don't treat these like sum types, so they won't tell you that you forgot to handle cases where parent is one of its child types. Etc.
There's an old book Functional Programming: Practice and Theory by Bruce J. Maclennan to learn this if interested, it uses standard math notation instead of being restricted to teaching an existing programming language.
I found the best nugget of this entire article to be at the very end
> Keep persevering. Don’t give up. If you find that a bunch of blog posts don’t explain things in a way that makes sense, skip them. Keep looking until you find some well-written ones. The same goes for courses and tutorials. Everyone comes from different backgrounds. What makes sense for someone else might not work for you. And that’s OK.
I feel like this is advice everyone should hear, especially junior or mid level developers
I actually think this is a great little series of articles.
Something I've found confusing with algebraic data types is the fact that they are not mathematical concepts. They originate as a programming construct and were invented as part of a programming language. They were then given a name that makes it sound mathematical, but it was just an attempt at giving some kind of metaphorical meaning to them. Same as if I name a search engine Odin, because Odin was the god of wisdom. But if you thought it had anything to do with Norse mythology, you'd similarly get confused.
People tried to find correspondence for them with mathematical theories. Thus, in Set theory they could correspond to Disjoint Unions of Cartesian Products. Or in category theory, they could correspond to Coproducts of products.
I guess you could say they came out of type theory, but I'm not sure of the history here fully. Did the extensions to the simply typed lambda calculus first come out of the theory, or was the theory retroactively built from the programming construct?
At any rate, my point is, these mathematical correspondences are very confusing. And I think it's wrong to try and understand these things from the mathematical angle unless you're a mathematician trained in one of the corresponding mathematical theories.
While I don't know the history of the term "algebraic data type", I do know that the algebraic structure of ADTs is much deeper than what this article presented. For instance this structure supports some limited notion of a derivative[1]! And purely algebraic manipulation of ADT "equations" can actually be used to generate some non-obvious results [2]. So I think it's probably wrong to say that ADTs were just "given a name that sounds mathematical".
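To give a flavor of the derivative result (a standard example, with made-up names): the derivative of a type describes its one-hole contexts. A pair is a^2, and its derivative 2*a says "pick which of the two slots the hole is in, and keep the other element":

-- Pair a corresponds to a^2.
data Pair a = Pair a a

-- Its "derivative" 2*a: which slot has the hole, plus the remaining element.
data PairHole a = HoleInFirst a | HoleInSecond a

-- Plugging a value back into the hole rebuilds a pair.
plug :: a -> PairHole a -> Pair a
plug x (HoleInFirst y)  = Pair x y
plug x (HoleInSecond y) = Pair y x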
That's all true from the fact that there are real correspondences with math theories. But like, that's the whole point of applied math no? To find correspondences with other things so that we can then leverage and apply the math principles to that thing?
I'm sure there are quite a few things in programming languages where you can use such a correspondence to your advantage, not just ADTs. But we don't teach them starting from the mathematical correspondence and working back.
To put it more simply, most programming isn't taught by first teaching learners about theoretical computer science. Generally it happens either in parallel, or you learn the theory after the fact. But for some reason, when it comes to functional programming, it seems most teachers want to start with teaching you the theory.
And my above point is that, even historically, it is not always true that theory came first. Oftentimes, the constructs were invented and used practically from finding solutions to concrete problems, and later a theory around them was developed. In the case of ADTs, I'm not 100% sure which came first, but I know they were first implemented in the programming language Hope. Not sure if the theory was there prior or not.
I think you might be right in this case that practical use preceded theoretical understanding but that's not really my point. Software development certainly has examples of pseudo-math (for example there was a trend a few years back to write JavaScript that could run on both the server and the client. For reasons passing understanding, this was called "isomorphic" by some practitioners). All I meant was that ADTs are not that.
Ah I see. I believe you are correct. The correspondence seems to hold quite well, and even the name algebraic was pretty aptly chosen in this instance.
I'd be curious to know if there's an original paper on ADTs somewhere about it.
I actually prefer the second answer in that stack exchange, the one from Gilles. It basically says what I'm saying:
It is not that there are no mathematical correspondences, but ADTs are not constructs of any mathematical theory. There's no ADT in set theory. There's no ADT in category theory. The only place you might find them is in type theory, but it's unclear to me whether that is the theory that defined them, or simply the theory used to define them mathematically after their invention in computer science. And like Gilles' answer suggests, ADTs bring some concepts that complicate the mathematical formalism for them.
This is why the whole sum type and product type thing is wishy-washy. You can say Java classes are product types, but they are not quite like Haskell's product types, etc. They aren't an implementation of a very well defined mathematical theory. Unlike, say, how Java and Haskell both have support for set theory and algebra built in.
> There's no ADT in category theory. The only place you might find them is in type theory
I wouldn't say that this is true. Category theory has a definition of product of objects, sum/coproduct of objects, terminal object (unit type), and initial object (empty type). There is a correspondence between type and category theory where a certain type theory is the internal language of a certain category: https://ncatlab.org/nlab/show/computational+trinitarianism
Union is a better example than Enum to explain what an algebraic data type is. It’s a natural progression from Struct as product type to Union as sum type.
Okay, maybe the design isn't great. Can we talk about algebraic data types now? I come to HN for interesting discussion about interesting topics. I start to wonder why I bother when the only comments on a post like this, which actually concerns something that interests me, are complaints about design.
(If you think comments have gone off-the-rails, I find the best solution is to downvote the off-topic top level comments, and then make a substantive on-topic top level comment.)
While you do have a point, there's a very simple way to collapse comment threads that you don't care about. And instead of commenting on an existing thread about the design (sparing those who have already muted that), you chose to create your own comment thread and further clutter the post discussion.
While we're on the topic, why do sites have non-wrapping anything? That site has non-wrapping code blocks, which made it so tedious to read on my phone that I gave up. HN has non-wrapping quotes. In what universe is non-wrapping anything a good idea?