A Badass Way to Connect Programs Together (joearms.github.io)
351 points by norswap on Jan 26, 2016 | 145 comments


I think OSC is interesting, but Armstrong goes too far in his disparagement of XML and JSON:

> I think it is totally crazy to send JSON or XML “over the air” since this will degrade the performance of the applications giving a bad user experience and higher bills - since ultimately we pay for every bit of data.

If every JSON and XML API switched to OSC, the difference in data usage would scarcely be measurable. First: The vast majority of mobile data is video, audio, images, and application updates. Second: When sent over the wire, JSON and XML are typically compressed with gzip or deflate.

And let's not forget: JSON and XML have some significant advantages over most other serialization formats. They're ubiquitous. Practically every programming language has libraries to parse and generate them. Practically every programmer knows how to work with them. They're human-readable. No special programs are needed to view their contents.

In other words, JSON and XML are only "wasteful" if you don't count developer time. Once you do, it suddenly makes a lot of sense to "waste" hardware resources. After all, programmers are expensive. Hardware is cheap.


> And let's not forget: JSON and XML have some significant advantages over most other serialization formats. They're ubiquitous. Practically every programming language has libraries to parse and generate them.

One of the interesting things about Erlang is that it has binary pattern matching. Matching any binary in it is at least as trivial as parsing JSON in other languages.

Here is what an IP packet parsing might look like:

    <<4:4, HeaderLength:4, _DiffServ:8, _Length:16, _Identification:16,
      _Flags:3, _FragOffset:13, _TTL:8, Protocol:8, _HeaderChecksum:16,
      SrcAddr:32/bits, DstAddr:32/bits, OptionsAndData/bytes>>
It looks almost like casting a binary blob to a C struct, but you can do even crazier things, like binding Length and matching against it later in the pattern if it specifies a length-prefixed binary, for example.
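
A minimal sketch of that trick, assuming a hypothetical frame with a 16-bit length prefix:

    parse_frame(<<Len:16, Payload:Len/binary, Rest/binary>>) ->
        {ok, Payload, Rest};
    parse_frame(Incomplete) ->
        {more, Incomplete}.

Len is bound by the first segment and immediately reused as the size of the Payload segment; whatever is left over comes back as Rest, so you can keep chewing through a stream.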

> the difference in data usage would scarcely be measurable

It depends. See, Joe's background is Ericsson. Their equipment carries about 50% of the world's smartphone <-> internet traffic. When people ask "but who uses functional languages in the industry?", the answer is: if functional languages didn't work, chances are high you wouldn't be seeing pictures of cats on your smartphone.

But the idea is that mobile bandwidth is still a precious resource, even as our CPUs and hard drives keep getting bigger; it is one shared resource that is expensive. Maybe whether 1M clients use JSON vs. a binary protocol doesn't make much difference for some startup, but when we talk about billions of connected devices talking over radio waves, those things add up. Game servers also often use UDP and binary because of latency. And Facebook knows a thing or two about chatting with lots of users, so they use FlatBuffers https://code.facebook.com/posts/872547912839369/improving-fa... which is binary as well.


While very informative and interesting, I don't think your comment really engages with my point. I never mentioned Erlang or functional programming. I'm just talking about data serialization. Namely, that sending JSON or XML over the air isn't "totally crazy".

Again, the vast majority of mobile data isn't JSON or XML. If mobile bandwidth were a Californian drought, video would be alfalfa farming. JSON would be taking a long shower: maybe you feel a little bad, but it doesn't matter. All the short showers in the world can't make up for the huge volume of water (bandwidth) used by alfalfa (video).

Obviously, if you have a billion users, trade-offs change and it can make sense to use esoteric or bespoke protocols. But even for Facebook, using JSON isn't crazy. It's just inefficient. And Facebook didn't switch to FlatBuffers for the bandwidth savings. Their primary goal was to improve loading times for their local cache. Had FlatBuffers required 10% more network traffic than JSON, they likely still would have made the switch.


> I don't think your comment really engages with my point. I never mentioned Erlang or functional programming [...]

Sorry. It was mainly a response to "JSON and XML are only 'wasteful' if you don't count developer time. Once you do, it suddenly makes a lot of sense to "waste" hardware resources. After all, programmers are expensive. Hardware is cheap."

And I was pointing out that it really depends on what languages or type of programmers you use, and on what "hardware" is. In some languages binary parsing is really easy, just as cheap and elegant as using a JSON parser. And well, Joe is Erlang's father, so pointing out Erlang was the obvious choice.

As for hardware, bandwidth is not hardware. It really is "something else" -- usually a shared resource. And if there are enough devices using it, even small changes start to make a difference.

Think of it a bit like a nested loop -- a small optimization in the inner loop might have large visible benefits. Or say, in a huge database with 10B records, choosing a 32-bit int vs. a 16-bit one for a column makes for a good difference in size (2 bytes per row is roughly 20 GB at that scale). That is basically where Joe was coming from: really valuing bandwidth as a shared resource. (He also likes to talk about protocols quite a bit.)

> Again, the vast majority of mobile data isn't JSON or XML

The important data is JSON though: the control messages, the pages that load (and they are slow), the stuff that stops, starts, and manages everything -- all of that is JSON.

> If mobile bandwidth was a Californian drought, video would be alfalfa farming

Hmm, video is 50% of all traffic on smartphones (http://www.ericsson.com/res/docs/2015/ericsson-mobility-repo... says 45% in 2014). Not that the rest is all JSON, but I wouldn't say it is all just the "taking showers" kind of equivalency; it is more like grapes, almonds, strawberries, and walnuts.


Remember to also subtract images, audio, and other actual content and code from the remaining 50%. And realize that if you're using a flat structure to encode hierarchical data, you'll have to encode that hierarchy somehow. Then subtract compression.


My money is on ad blockers saving several times the bandwidth per user that could possibly be saved by a switch away from JSON/XML.

Ignoring everything else, there is much lower-hanging fruit for saving bandwidth.


FWIW, it seems like Facebook moving away from JSON had little to nothing to do with the size of the data.


I think it's also important to note that JSON isn't exactly something that can be parsed while it is being streamed.


From personal experience, I can tell you that it can. The i3 window manager on Linux comes with a status bar program called i3bar. It also includes a program called i3status, which constantly feeds i3bar information to display as JSON through a standard pipe.

I wrote an i3status replacement that does exactly this. It sends JSON to STDOUT at a set time interval. i3bar takes the JSON as it comes through STDIN and changes the look of the bar as it's read.


There are multiple libraries doing just that. For example https://sites.google.com/site/gson/streaming


JSON and XML can also be wasteful of developer time. I've seen a lot of time wasted in organizations where JSON was used as the communication medium between microservices (hundreds of them): lots of time spent implementing parsers, watching out for changes, etc. (Of course, there are many process & culture "solutions" for these problems.) Most of this goes away easily when you adopt a binary/IDL-based format (in the latest case I've been through, Thrift).

JSON/XML are the "dynamic language" of the transport protocols: flexible and quick to prototype, but can become very hard to "refactor" and performance suffers, as the thing grows.

As for public APIs, obviously agree with your point - it's a lot easier to roll out a JSON interface as parsing libraries are widespread, etc etc


I don't see how using JSON/XML or not has anything to do with the problem of communicating changes. The problem isn't the underlying serialization format, but what kind of schema validation, versioning and type definitions you build around it and how you make sure that all consumers know or may find out about it. Interestingly you mention Thrift, which can seamlessly use both XML and JSON as its underlying protocol.


Thrift and Protobuf both support JSON as transport, yes - it's useful when dealing with clients that can't support the binary encodings (eg browsers, last time I checked). It's an added benefit: you can use the fast encoding between clients that support it, but can fallback as needed.

The difference between communicating changes by setting a bunch of rules on how to do versioning/structure/etc vs changing an IDL is akin to verifying a program structure with unit tests and code reviews vs enforcing them with a compiler (sure, you can always use stuff like Json Schema or XSLT - personally never had a good experience w/ those). The latter is far easier, in my experience.


Fully agree. The human readability of JSON and XML is "fake productivity": it tricks you into thinking the format is easy, invites hacks like hand-writing the serialization, and only helps you in the short term.

I've worked in places where binary serialization and deserialization functions were generated automatically just by adding a field and its bit-mapping to a message spreadsheet. This was way more productive than misspelling the property name of a JSON field between serializing and deserializing and spending a whole day troubleshooting that. It also automatically documented the format, instead of sending an example XML file to the counterpart implementing the loading of the data elsewhere and saying "the data will look approximately like this". Viewing the data for debugging was done with a special program (similar to Wireshark) that understood the protocol and rendered a much nicer view of the data than pasting a JSON file into Notepad++ -- and there you need indentation and highlighting plugins anyway, so the number of programs you need installed is not an argument either. Streaming data through Notepad++ doesn't work so well if you want to view data on the fly.

Security is also one aspect: at every place I've worked there has always been one smartass who thinks XML can be created by string concatenation. That's impossible if the serialization functions are fully encapsulated.

None of this is a problem inherent to XML/JSON, as it can all be solved with proper processes; it's just that they invite that type of culture.


> After all, programmers are expensive. Hardware is cheap.

Except in China. I worked at a US company that did almost zero optimization on their code, deploying to 5000 AWS instances with a team of about six. In China, bandwidth closely followed by CPU were the cost centers, and the Chinese team of 30 engineers optimized the same code to run on around 50 (admittedly beefy) servers.


Doesn't invalidate the point; another EC2 instance is a few dozen bucks a month to run full time; another developer, even overseas, is going to cost you more, at just 40 hours a week.

But that said, it's rather a good thing to point out for the OP, as Joe Armstrong himself tends to value programmer time over a machine's, and that micro-optimizations are rarely ever worth it (as opposed to yours, which sounds like there were huge, glaring algorithmic issues to fix with macro optimizations, complete replacements of algorithms rather than tightening up code, careful profiling and tweaking to get a 30% speedup in one critical section, etc)


I guess your response brings out the fact that it's not an either-or proposition. It's a sliding scale between dev resources and computer resources, based on the costs in your situation.


Yes, the exact tradeoff has to be calculated in each instance: exactly how much programmer time it would take to save how much machine time.

However, getting X hours of average programmer time, vs X hours of average machine time...machines are cheap. You should default to optimizing programmer time, not machine time.


4950 instances times a few dozen bucks a month is like 2 million a year, just saying.


You missed the point. I was just pointing out that 1 unit of hardware, compared to 1 unit of developer, is cheap. That was what the OP was commenting on.

Obviously, 4950 units of hardware, repeated monthly into perpetuity, compared with 30 units of developer, for a fixed length project, is not cheaper.


30 developers in a cheap country cost less than 2 million per year, I think. I'm not 100% sure of their wages and taxes though.


And then you notice that companies that have to be profitable care, while those that are just burning through VC money don't.


This matters for VC funded companies too.

I just moved a customer off EC2. A small startup. I've billed them about $20k for the work. It will take them ~2.5 months to repay in saved hosting costs at current load levels. But if they're still at current levels in 3 months time, something is wrong. On top of that their ops costs have dropped as we have more control over the environment.

Basically, with rapidly growing hosting needs, getting a more cost effective setup was a matter of survival: They'd be unlikely to close another round of funding in the next few months if they didn't get that cost under control.

I wonder how many startups fail because they don't understand how to get their hosting costs under control. It's way too common that I see developers who seem to think that servers are basically free, and managers who have no clue that they need to seriously question why developers are making the server choices they are making.


Oh, it definitely should matter.

But, as we see with Twitter, it often is completely ignored.

Luckily WhatsApp gave companies some inspiration to reverse this trend.


I think part of what he's getting at is that if things embrace simplicity in their design, suddenly the fact that "every programming language has a JSON parser" becomes less compelling, because it's easy to write this stuff yourself.

Perhaps he goes a little bit far, but I'd agree that shoehorning your data serialization problems into a ubiquitous format can be a real headache.


> When sent over the wire, JSON and XML are typically compressed with gzip or deflate.

…which probably amounts to CPU-years of wasted time and electricity, considering how ubiquitous this is.


Even with binary protocols, compression is worthwhile. No matter your serialization format, most of the bytes transferred are going to be payload. And most of the time, that payload is going to have redundancies. Also, modern implementations of decompression algorithms are very efficient. Mobile SoCs can approach 100MB/sec for deflate. In other words: If your phone downloads 10GB of compressed data a month, it's going to use an extra 3-4 seconds of CPU time per day. That's a rounding error. In fact, it's quite possible that the radio power savings outweigh that extra CPU usage.


The problem with compression is, it's everywhere.

Sending a JPEG over compressed HTTP(S), through a compressed and encrypted VPN, is too much compression. Then the client may save the JPEG to a local compressed NTFS cache... This is just wasted performance; plus, compressors are a common place for bugs, so this is also a security issue.

Just stop adding compression to everything and think about it first; mobile phones and modem lines may very well be an exception.


This could make for a really interesting data visualization.

There are probably lots of things we do as programmers that are insignificant in the context of our own systems, but are so culturally ubiquitous that they add up to an enormous impact globally. I wonder how even things like different application architectures would consume different amounts of power. Or, is there a side effect of shifting computation to the client side that causes more energy to be consumed from dirtier sources if, say, data centers tend to draw from cheap and clean hydro power?


The other question about compressed JSON/XML is how many clock cycles are getting spent on this that wouldn't be otherwise -- which on mobile devices is going to translate to power consumption.


You probably would be surprised to realize that sane engineers nowadays are using this or that variant of "protocol buffers".

Also, in theory, nothing beats messages with fixed-length headers and variable-length payloads of binary tagged data.


Compression only works well if there is a lot of repetition of key names in the stream being compressed. If you are sending individual objects it may not be able to do much. And compression will never be able to eliminate brackets and commas.

You are right that for the majority of applications json is fine.


> And compression will never be able to eliminate brackets and commas.

Isn't that the whole point of compression: to minimise repetition of common features in data?


This'd be worth testing. Isn't the way gzip works to build up a dictionary of characters/phrases and then store pointers to the dictionary? I'm not sure you can compress { or } but }); is another story.


I just ran that experiment: https://gist.github.com/ggreer/4eb3ad61e97926e559f6

ASCII is extremely repetitive. 1MB of random braces gzips down to 163KB. That's a compression ratio of 6.1:1. Optimal compression would be close to 8:1 (since each byte really contains only one bit of information).
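
The same experiment is easy to reproduce from an Erlang shell, for the curious (a rough sketch; the exact ratio will vary a little from run to run):

    Braces = << <<(case rand:uniform(2) of 1 -> ${; 2 -> $} end)>> || _ <- lists:seq(1, 1048576) >>,
    Ratio = byte_size(Braces) / byte_size(zlib:gzip(Braces)).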


    for (let i = 0; i < 1024 * 1024; i++) { buf += Math.random() > 0.5 ? "{" : "}"; }

Your sample data is bunk for determining the compression characteristics of braces in a realistic json file.


The point was to show that compression works at the level of bits, not characters. Realistic JSON is mostly content, which means your compression ratio depends more on content redundancies than braces or quotes. So too for binary serialization formats.


Gzip uses DEFLATE, which uses Huffman coding. Huffman coding works by replacing common symbols with shorter codes and uncommon symbols with longer codes, so the codes are variable-length. They are created in such a way that no code is the prefix of any other code; that way, the decoder knows when it has reached the end of a code without needing any extra information beyond the Huffman tree, which tells it which codes belong to which symbols. The Huffman tree can be pre-agreed or, more commonly, included with the compressed data.
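
As a tiny worked example (with made-up frequencies): for symbols a:5, b:2, c:1, d:1, a Huffman tree assigns something like a=0, b=10, c=110, d=111 -- no code is a prefix of another, and the 9 symbols take 5*1 + 2*2 + 1*3 + 1*3 = 15 bits instead of the 18 bits a fixed 2-bit code would need.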


Major supporter of this thinking, and did several sketches in this direction a few years ago: http://www.illucia.com/

Made of: 1) a suite of interconnectable videogames & apps and 2) physical patchbays to connect them

Came from thinking about software like modules in a modular synth. Playgrounds of interconnectable control structures. Was patching games into samplers, slaving text editors to drum machines, etc.

related writeups: http://www.illucia.com/faq/ another from 2010 or so when I was just starting coding (so a lot of it feels silly now.. but still some related nuggets): http://www.paperkettle.com/codebending/


I stumbled over your projects a few years back. Awesome work!


That is really cool!


Side Q: can I ask what typeface you used for illucia?


It's 'Square Serif Medium' (Look at the css, or do Right Click => Inspect Element in your browser)


Side A: I use http://chengyinliu.com/whatfont.html to quickly check the fonts on a page with Safari, other browsers may have similar tools too.


Ah OSC, my first encounter with it was with the Make Controller[1] (basically an ARM chip with an OSC service endpoint). OSC always struck me as one of those protocols that did something really useful in the wrong way. It just "feels" awkward to me for some reason and I never quite put my finger on it.

That said it was great fun building a robot with the controller but hard to get asynchronous feedback. If I were doing it again I'd try to figure out a full duplex OSC channel for something like that.

[1] http://makezine.com/2006/05/11/make-controller-kit/


> This is wonderful news for systems like Node.js whose concurrency model is non-existent and whose idea of having a fun time is throwing promises into the future and hoping that nobody breaks them.

Oh so true. I wish I learned Erlang (and to a lesser extent Go) much earlier in my career after wasting so much time with Node. I now laugh when I see people on Twitter/Reddit talk about Node's concurrency as a reason to learn it over other languages.


The concurrency model in Node is essentially the reactor pattern. While there are lots of concurrency patterns that you should learn, the reactor pattern is one of them. The big thing to keep in mind is when you should use it and when you should not. After you spend time learning about concurrency in Erlang and Go (and practicing it), I recommend doing a mental kata and asking, "When would the reactor pattern be more appropriate than this?" If your answer is "never", then you should go back to Node (or simply implement it in whatever language you like) and practice until you know what it's good for. (Hint: it is quite useful when implementing UIs or anything else that has a series of messages/events that have to be processed in linear order, especially if you want to do something else at the same time while keeping full control over how many resources you are using.)


A few places where I could see using the reactor pattern over "concurrency unit (i.e. green process, thread, co-routine) per connection" are very high-performance scenarios where callback chains are very short.

Think proxies (haproxy), routers, forwarders, web servers (nginx) etc. Where memory context per connection should be minimal, and everything should be as close as possible to the select/poll/epoll loop.

Funny enough, this also includes demos, quick example scripts, and benchmarks. I wonder if that's what hooked most people on the reactor pattern -- small examples in Twisted, Node, etc. look pretty easy and simple. But when you start adding business logic and callback chains evolve into callback/errback trees 10 levels deep, things get very scary.


You make a very important point here. When you're creating a very high-throughput network server, you want to use as little memory per connection as possible. Reactor and proactor I/O make it very easy to reason about execution order and, more importantly, memory usage. This kind of code usually gets written in C (or at the very best C++), and minimum memory usage per socket is carefully planned.

Co-routine context and green thread stack size can sometimes be tweaked and optimized, but if you want precise control over the memory allocated per connection, reactor/proactor is hard to beat.

Besides, to be 100% honest, C and C++ just don't have native support for any other concurrency model. That's probably the main reason why Node.js was designed to use callbacks. I'm pretty sure it would use generators or async functions if it were designed today.


A funny thing about "high-performance scenarios" is that the reactor pattern is often slower than other patterns. If you want to use the reactor pattern, you have to use asynchronous IO. But asynchronous IO involves making more syscalls, and blocking IO is actually rather fast. The performance differences are going to depend on a lot of particulars.


If someone has told you synchronous I/O is slow, they've obviously misunderstood the entire C10k argument that's been going on for a while.

The problem with synchronous I/O is not speed, but blocking. Blocking means the only possible concurrency model is based on processes or threads. Consequentially, most performance problems with synchronous I/O stem from the thread model: context switching, synchronization, memory overhead, thread-pool starvation and so on.


> Blocking IO is actually rather fast.

Right. Yeah, there was a presentation about Java concurrency patterns about that, basically dispelling the myth that everything non-blocking and asynchronous will be faster than the old-school blocking thread per socket.

Interestingly, I believe haproxy and nginx use a hybrid model. They have a worker thread per CPU, all listening to the same socket using epoll from multiple threads! When data arrives they all get woken up and then use a shared mutex to decide which one will handle the request. (Later kernels fixed this, so one can now have exclusive wake-up across the same socket.)


Nginx's implementation may have changed since the last time I delved into it (and I delved pretty deep), but at least 2 years ago that was not the case. Nginx did have some vestiges of an abandoned multi-threaded implementation attempt from the 0.x days behind compiler flags, but beyond that it was thoroughly single-threaded; there is, however, a master process which forks itself into multiple worker processes.

There is a so-called accept_mutex, but I'm pretty sure you can't avoid that if you want to have multiple cores handle connections from the same port. Even Erlang would have to do that somewhere behind the scenes. Newer Linux kernel versions support SO_REUSEPORT, which is meant to address this situation - I guess this is what you're referring to as exclusive wake-up across the same socket?


Interestingly, Java 7’s "native non-blocking IO" actually uses exactly that – you have a ServerSocket, and, if a user connects to the server, it spawns a Thread and gives it a normal SocketChannel.


This completely depends on the kernel.


> proxies (haproxy), routers, forwarders, web servers (nginx)

...all stuff that people should not implement in javascript in the first place.


Someone at a place I used to work got the Node.js fever and stuck node's http proxy in front of our public facing servers. That was an epic disaster from multiple points of view: stability, performance, debuggability etc.


Or you use promises/observables (with arrow functions) or generators or async functions, and they don't. 2014 called, they want their node problems back.

Basically almost none of the criticism in this whole thread is true today. The only remaining true bit is that node still relies on cooperative multitasking and one of the tasks can hog the CPU of a single worker. Which isn't very good, but still, way better than say, the good old Rails 1 request per process model.

Someone should probably do a proper benchmark to show how different numbers of workers at different CPU workloads affect a node service's response time / latency. Especially with multiple processes (cluster), I would bet the effect would be much better than what people expect it to be.


We should remember to distinguish between concurrent, parallel, and asynchronous programming. UI is a bad example for the reactor pattern, since the reactor pattern is for concurrent processing and UIs should usually process events sequentially (but asynchronously). That's just a bog-standard event loop, not the reactor pattern.

"Practicing" different concurrency patterns sounds like a lot of tedious effort, and the suggestion to "go back to Node" seems a bit high-handed to me. I would encourage everyone to pick and choose what expertise they want to develop.


Actually, UIs process data synchronously; that's why Qt, for example, doesn't need to std::mutex everything.


When I say "asynchronous" I'm talking about the arrival of events. It's not a particularly well-defined word, though.


> The concurrency model in Node is essentially the reactor pattern. While there are lots of concurrency patterns that you should learn, reactor pattern is also one of them.

A usable reactor pattern requires green threads (or threads). Node.js does classic IO multiplexing (i.e. what has been available in C since the introduction of select()). It's not bad, but don't delude yourself into believing Node.js does anything new.


Um... you can get the "reactor pattern" (I really hate SV rebranding of stuff that's existed forever by another name) with gen_event and gen_server in Erlang.

The pattern isn't special or strongly correlated to node.js


Not sure how long your 'forever' is, but FWIW, the name Reactor Pattern is at least as old as the ACE network library (which, if it didn't invent it, at least popularized it), i.e. circa 1995.

That predates the open sourcing of Erlang by a few years.

Also ACE didn't originate in SV.


I started programming with Node a year ago and currently diving into Erlang/Elixir... I am hopeful/afraid that I'll reach the same conclusion!


To be fair, Erlang escaped from the lab in 1988, and only supported SMP in 2006. Node.js was released in 2009, it's only had 6 years, maybe it's still a bit early for SMP?


You can use SMP just fine in Node.js today with https://nodejs.org/api/cluster.html.

The problem with Node.js and concurrency is that everything depends on trusting code to be perfectly written, and that perfectly written code must be written in a naturally confusing callback style that is really easy to screw up.

If you do it perfectly, then you get great scalability. But one bonehead mistake will ruin your concurrency. By contrast with a pre-emptive model you get decent scalability really easily, but now you've got a million and one possible race conditions that are hard to reason about.

This is not a new design tradeoff. Go back to the days of Windows 3.1 or the old MacOS versions below OS X. They all used cooperative multi-tasking, just like Node.js. Today what do we have? Pre-emptive multi-tasking in Windows, OS X, *nix and iOS.

Web development has actually gone back to a model that operating systems abandoned long ago. As long as your app is small, there is a chance that it will work. But as your app grows? Good luck! (That is why operating systems uniformly wound up choosing hard to debug race conditions over predictably impossible to solve latency problems from cooperative multi-tasking.)


You can totally run a cluster of Node instances. But each instance can only do one thing at a time. No multiprocessing, you have to spin up a new VM for each thing you want to do.

You also don't get shared memory concurrency when it's beneficial, and you have to speak in callbacks.

There are better langs out there for concurrent web. Erlang and Haskell+Warp are fantastic, for instance.


> Node.js [...] whose idea of having a fun time is throwing promises into the future and hoping that nobody breaks them.

As a non-fan of Javascript as she is wrote I'd hear this criticism and nod in almost any situation, but not while the author is simultaneously advocating writing your stack as a conglomerate of processes written in different languages communicating over UDP.


The basis of the protocol is similar to that of protocol buffers [1]. Though the tag field containing the map/list of segments in data is pretty novel IME.

[1] https://developers.google.com/protocol-buffers/docs/encoding


It's not really that much like protobufs. I mean, pretty much all wire formats have some things in common, true.

Protobufs depend, for interpretability, on a schema. That is, if you don't have the schema file (which is normally communicated out-of-band, e.g. by source control), you know almost nothing about what the wire-encoding should decode to.

Protobufs use variable-length integers (varints), both as a datatype for the payload, and as an essential feature in understanding the encoding.

The wire keys in protobufs don't fully specify the datatype. Instead they include 3 bits to tell you what length-scheme is used, and some other bits (variable number of bits, because varint) to tell you which field in the schema will tell you the rest of the datatype for this field (as well as the key-name of the field).
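
For the common single-byte case (field numbers below 16, so the varint key fits in one byte with its continuation bit clear), the key splits into five bits of field number and three bits of wire type -- a rough sketch in Erlang's bit syntax:

    <<FieldNumber:5, WireType:3>> = <<16#08>>.
    %% FieldNumber = 1, WireType = 0 (varint)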

Protobuf is a rich protocol for doing potentially-complicated things. This OSC thing is very nearly the dumbest thing that could work. Indeed, Armstrong's post makes a point of noting that. To him, that's a feature.


Message Pack[0] is another protocol that follows this very successful simplicity.

A leading thought in a lot of this is that "unset" and a zero value are equivalent. It simplifies a lot of what the serializer and deserializer have to understand, and places the burden on code generation or libraries rather than on complex serialization. It was unnatural for me to think this way, as I take the opposite approach in memory (record whether something is set or not, don't just assume it equals zero), but on the wire this makes for very compact and backwards-compatible messaging.

Cap'n Proto[1] I have not played with, but would love to learn more about to see the advantages and disadvantages between protobuf, msgpack, and capn.

[0]: http://msgpack.org/

[1]: https://capnproto.org/


For a standardized version of Message Pack, see CBOR (RFC 7049): http://cbor.io/


What makes it a standardized version of Message Pack?

Message Pack has a specification that is well documented[0] and even the rfc you refer to specifically states it has different goals (though, eliding specifics with pretty sad generalizations)[1]

[0]: https://github.com/msgpack/msgpack/blob/master/spec.md

[1]: https://tools.ietf.org/html/rfc7049#section-9


> ...and even the rfc you refer to specifically states it has different goals (though, eliding specifics with pretty sad generalizations) [link to RFC7049, section 9]

You find those generalizations sad because that is the acknowledgements section!

If you start from the beginning, you only need to read three paragraphs to see that "Appendix E lists some existing binary formats and discusses how well they do or do not fit the design objectives of the Concise Binary Object Representation (CBOR)." [0]

If we look at Appendix E, [1] we find that section E.2 contains an explanation that you're likely to be more satisfied with. [2] However, before you go off and read section E.2, I strongly urge you to read the couple of paragraphs in Appendix E, first. The authors of RFCs generally tend to try hard to remove redundancy in their prose, and later sections often elide information covered in earlier sections.

[0] https://tools.ietf.org/html/rfc7049#section-1

[1] https://tools.ietf.org/html/rfc7049#appendix-E

[2] https://tools.ietf.org/html/rfc7049#appendix-E.2


The sad generalizations are in E.2

Things like stating that "evolution has stalled" while also recognizing that the format is stable is hand wavy when you consider we're discussing things that go over the wire and even end up on disk. Yes, stability should be a goal.

The real difference between CBOR and MessagePack is that CBOR wants to be "schemaless" in the applications themselves instead of just on the wire. They hold up json as the example format for something that doesn't require schemas, and yet I see "json schemas" being published[0], and even people trying to standardize the schema format[1]! Looking at any modern JSON API would tell you that "schemas" in the xml sense are not required, but applications all must be very knowledgeable of the format.

Having a data type for "PCRE" is just insanity on the wire, and I can't imagine the type of API you'd be publishing where you would accept URLs or Regular Expressions or Text or Binary, AND want to be able to decode them into proper types in memory all without applications on both ends knowing that ahead of time.

Which brings me back to my initial point: CBOR is not just a "standardized Message Pack", it's a very different approach to what they think the applications on either end of a protocol should be doing.

[0]: http://json-schema.org/

[1]: tools.ietf.org/html/draft-zyp-json-schema-03


I've done quite a bit of work with CBOR. I took a quick look over the Message Pack spec and found it a bit wanting. The CBOR encoding is more consistent and it allows one to easily define extensions (up to 2^64 worth http://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml). One defined extension allows you to define references, which I found useful in encoding Lua tables.


I would go a lot further back than that. OSC (which I confess I don't know at all) sounds very similar to the DER encoding of ASN.1.


OP hasn't addressed the fact that UDP will willfully drop packets and send them out of order. What if I need to stream some data? What if I care about the order my IPC messages arrive in?


It's covered: "On the other hand UDP can suffer from packet loss and OSC is an obscure protocol". If you need something UDP doesn't handle, then it makes sense to use something else, but a lot of time is being wasted using something like TCP if you don't need the things it does. With live sound, milliseconds count and it doesn't take a skilled ear to hear when things are even a little later than they should be. It might even be better to have no sound than a late sound.


Rest assured that the designers of SuperCollider considered this use case when deciding to use OSC: http://doc.sccode.org/Guides/ServerTiming.html


Dropped packets might not matter if you send the latest state. Dropped packets can also be detected if each UDP packet in the protocol carries a sequence id; the receiving end then keeps the last time and sequence id it got, so it knows if it received packets out of order or duplicated packets. Yeah, at this point you are doing some TCP stuff, but in some cases it is worth doing.
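
A minimal sketch of that framing in Erlang (the 32-bit sequence number is an arbitrary choice; pick whatever width suits your message rate):

    %% Sender: stamp each datagram with a monotonically increasing sequence number.
    send(Socket, Host, Port, Seq, Payload) ->
        ok = gen_udp:send(Socket, Host, Port, <<Seq:32, Payload/binary>>),
        Seq + 1.

    %% Receiver: compare against the last sequence number seen.
    handle(<<Seq:32, Payload/binary>>, LastSeq) when Seq > LastSeq ->
        {deliver, Payload, Seq};
    handle(<<_Seq:32, _/binary>>, LastSeq) ->
        {drop_duplicate_or_stale, LastSeq}.

Anything that arrives late or twice falls through to the second clause; a gap in the sequence numbers tells you something was lost.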


Reimplementing tcp in udp is either very smart or very dumb. It's a great mid band filter for programmers. When it pays off, it's huge. But when it doesn't, it's death by corner case.


The article doesn't mention OSC bundles: groups of OSC messages that are designed to be 'interpreted' together. A bundle usually has a timestamp associated with it.

I agree this might not be the best method of dealing with out-of-order reception of packets, but it does exist in the protocol.


a bundle always has a timestamp


Indeed. The example appears to be "spray/pray": https://github.com/joearms/music_experiments/blob/master/pd....


Agreed, but one specific point referenced by the Alan Kay quote is building small components that message each other over the loopback interface with UDP... Messages never leave the original machine, so there are fewer places to drop packets.


Applications that use OSC are generally running on the same machine or a local network that you have control over.

You shouldn't drop packets on a single machine unless you are sending a lot of them and run out of buffer. The buffer size is also tunable if you do have issues like that.

If you need to do things over the internet it would probably be best to do OSC over TCP.


Consider SCTP as an alternative if you want messages but want them in order.


Only if you have control of every device between your sender and your receiver. Don't get me wrong, I love SCTP, but good luck consistently making SCTP connections over the public internet...


Also QUIC[0] should be a good candidate to be used instead of UDP.

[0] https://docs.google.com/document/d/1lmL9EF6qKrk7gbazY8bIdvq3...



Then UDP won't work for your requirements, right?


UDP, without some layer on top of it for ack or retry, or ordering, isn't useful for many requirements. Maybe things like displaying stock quotes, where stale data doesn't matter after a few seconds?

I think the point is just to say that the article shouldn't tout plain UDP as a fix for TCP being hard in certain languages without calling out the downside.


TCP and UDP are different solutions for different problems. You also shouldn't tout TCP as a fix for UDP being hard without calling out the downside--TCP doesn't let you drop packets and can suffer from bad latency problems.


Yep. When suggesting a change from one to the other, it might be a good idea to explain the benefits and tradeoffs. I didn't suggest otherwise.

That said, most protocol implementations that I'm aware of over UDP have one or more of "retry/order checking/etc" on top of them. Like NFS, Bootp, tftp, etc.


> That said, most protocol implementations that I'm aware of over UDP have one or more of "retry/order checking/etc" on top of them.

But it's a common false belief that it would be "reimplementing TCP". Some guarantees of reliability are required for most practical applications but TCP's "in-order stream of bytes" isn't suitable for everything.

For any kind of real-time data, when packet loss occurs, TCP will re-send data that may already be outdated, causing later packets to arrive even later. In real-time uses, TCP makes any networking problems worse.

It's also worth noting that under ideal conditions (ie. no packet loss), TCP and UDP behave almost identically (after startup, that is). The issues only appear when you're working in less than ideal conditions and are worse the longer the physical distances are.


"willfully" is probably overstating it


OSC is in use for a lot of DIY hardware interfaces fed into custom modular (software) synths, DSP, or weird art programs using Pure Data, Max/MSP, Super Collider, Ableton Live and MIDI bridges to other pieces of sound plugins or external music hardware.

Nice to see a good write up on this project!!

http://en.flossmanuals.net/pure-data/ch065_osc/

https://puredata.info/community/pdwiki/OpenSoundControl

http://livecontrol.q3f.org/ableton-liveapi/liveosc/


> nobody does this properly for purely local applications.

You may be interested in qmail and its sub-applications, which does it properly IMHO. It's a great example of the article's key point, which is the value of simple messaging.


Wow, odd that this comes up as I was casting around in another domain, but my working assumption is that I would implement my GPU processing library using a 'tags data' style binary protocol.

However, I was thinking of it more from an Amiga COPPER instruction-list perspective, but it amounts to the same thing.

The really cool thing is that approaches like this tend to be language-agnostic, easy to program to, and extremely fast. All points raised in the article.


Tag lists more generally appear all over AmigaOS as well - especially from AmigaOS 2.0 on, as they started running into limitations with the existing APIs that depended heavily on fixed structs, they more and more opted for adding taglist parameters to avoid running into the same situation again.

And while it's request-response based over message ports, personally I see AREXX ports in most larger Amiga applications as a good example of the overall approach of cooperating processes to build larger systems.

Dbus and similar systems today have lots of great functionality, but what they miss is what ensured AREXX was "everywhere": it was trivially simple to support.

Pointing to the simplicity of OSC in the article really resonated with me, because a lot of these systems are far less useful than they could be because it's extra work. Simple systems on the other hand end up shaping application structure:

AmigaOS was massively message based from the outset: Message ports and messages were "built in" and used all over the place. E.g. when communicating with filesystems or drivers or the GUI you do it via messages.

Then came AREXX and there was suddenly a standard message format for RPC - both application-to-application and user-to-application.

If Amiga-applications were message oriented before, a lot of "post-AREXX" Amiga applications took it to the next level by being shaped around a dispatch loop that included not only e.g. GUI updates, but defining internal APIs that exposed pretty much every higher level action in the application, and used that as an explicit implementation strategy.


The best thing about OSC, IMO, is the many wonderful configurable controllers. When I was messing with OSC a few years ago, TouchOSC http://hexler.net/software/touchosc was my favorite.

It is really fun to use these beautiful Star Trek looking interfaces to make your robots move around for example ;)


TouchOSC has been some kind of model app for me - was available on the iPad very early and is still going strong to this day while countless other controllers came and went.


So what happens if you want to transmit some data in a nested hierarchical format? Just pack it into some (ad-hoc?) higher level protocol inside one of the flat pieces?

Seems like a recipe for inevitable incompatibilities when two languages no longer have codecs for the new underspecified layered format.


Although the article may not have made it clear, OSC is designed primarily to pass around control signals, rather than arbitrary, non-contextualized data. Control signals have inherent meaning, like "fast-forward <n> seconds", "set color to <x>", etc. The important distinction is that "set color to <x>" (a command) is distinct from "{ 'color': 'x'}" (non-contextualized data).

The layout of controls is hierarchical in the protocol; each control has an associated endpoint, which looks very much like a unix file path. For example, to unpause an audio player, you would send the message "/player/is_playing/set t" (or f, to pause). If you wanted to control the bands of a 10-band equalizer, you could implement that as "/player/eq/band/4/amp/set 0.5" (there are certainly other ways to approach it too).

If you want to send control signals to multiple parts of the audio player at the same time, you could use a "bundle" composed of a separate message for each parameter you're altering. All the messages will then be handled simultaneously.

Because of this, data does not need to be hierarchical. If you really want to, you can consider the endpoints to be "keys" and the control data to be "values" in a hierarchical key -> value map, just to prove that everything you could do with hierarchical data is still possible using OSC. But I would encourage you not to think of it in that abstraction, because the fact that OSC is built to pass messages, and not non-contextualized data, is a large part of what differentiates it from other protocols like JSON streaming.
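
On the wire these messages are very simple, too. A minimal sketch in Erlang of a single message with one float argument (not the full OSC spec -- no bundles, no other argument types; osc_string/1 pads to a 4-byte boundary with at least one NUL, as OSC requires):

    osc_string(S) ->
        B = list_to_binary(S),
        Pad = 4 - (byte_size(B) rem 4),
        <<B/binary, 0:(Pad*8)>>.

    %% e.g. osc_message("/player/eq/band/4/amp/set", 0.5)
    osc_message(Address, Value) ->
        <<(osc_string(Address))/binary,
          (osc_string(",f"))/binary,
          Value:32/big-float>>.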


It's a low-level protocol. There are blob and string types, so you can pack and send anything you like. I send JSON via OSC all the time.


For an especially bad-ass implementation of OSC, check out the SuperCollider project at https://supercollider.github.io/. SuperCollider is a programming language for audio synthesis/analysis, and algorithmic music composition. It works using a client-server paradigm, and they communicate with each other via OSC.

Shameless plug time! I've been using SuperCollider to design a machine-learning solo electroacoustic music performance platform, called Sonic Multiplicities. You can hear it at http://www.sonicmultiplicities.net/.


Reminds me a lot of bencoding, which is not as popular as it should be (outside of BitTorrent, of course).


It's unfortunate that bencoding doesn't support text strings.


At least bencoding has dictionaries.


Binary protocols are only cheap if your time is free. If you can't connect to your service via telnet, you're going to spend a lot more time debugging. The only way to fix this is to develop a textual representation of your format and write your own telnet, which is feasible, but nobody seems to have done that for OSC.

Also, I find Armstrong's fanaticism grating, and am inherently distrustful of anybody who claims that their community and language has all the good ideas. Because odds are, they're wrong.


> Also, I find Armstrong's fanaticism grating, and am inherently distrustful of anybody who claims that their community and language has all the good ideas. Because odds are, they're wrong.

You might find this[0] blog post by Armstrong interesting, wherein he comments on some things that were/are done wrong in Erlang. I would also argue that this 'fanaticism' is made to look more severe as you can find at least 4 different talks where Armstrong argues for the same things over and over.

In any case, Armstrong is actually very humble, which I think seems fairly obvious if you actually watch a few talks and read more than a couple of blog posts by him.

Could you point to something specific where you feel Armstrong misrepresents a strong side of Erlang, or is it just a general thing?

(Edit: Just to add, I think it should be said that the Erlang team created their favorite language and that it probably represents the most useful language in use today to them, so I don't think it's surprising that they'd see more good ideas there than in any other language. On top of that Armstrong is a very straight shooter, it seems, so I think this might be something to take into account.)

0 - http://joearms.github.io/2013/05/31/a-week-with-elixir.html


It isn't so much that Joe doesn't think that Erlang did anything wrong; it just feels like he believes that Erlang's way of doing things and Erlang's ideas are the only good and the only right way of doing things. This makes me suspicious, because that is almost never true of any language or community.


I don't understand his emphasis on sessions in TCP and UDP. A session requires nothing more than a "unique" number to identify it. You can certainly have a session in UDP; just attach a session number to all the messages belonging to the session. And you can certainly be sessionless in TCP; just close the connection after each message. An HTTP request has no session by definition, and it sits on top of TCP.


> just close the connection after each message.

That is if you don't expect to handle more than one connection at a time in a reasonable way. You can certainly do this loop:

   server_sock = create_and_listen_on_socket(...)
   while True:
      csock = server_sock.accept()
      csock.send(process(csock.recv()))
      csock.close()
TCP setup and teardown are not cheap (especially compared to just sending UDP packets).


UDP is certainly faster than TCP, no question about it, because of the different levels of quality of service. However, coding for both is similarly simple, given the same requirement of a short, receive-only message. The TCP session has nothing to do with the complexity; the complexity comes from how to do process() in parallel.

A receive-only message can be handled in TCP:

   server_sock = create_and_listen_on_socket(..., MAXCONN)
   while True:
      csock = server_sock.accept()
      msg = csock.recv()
      csock.close()
      process(msg)
A UDP loop has similar steps, minus the accept and close. The complexity to use threads or async to run process() is the same for UDP and TCP.

For a normal data volume remote command protocol, the simple TCP loop is more than adequate. Just set a reasonable MAXCONN to queue up client connection requests, which can drop connections if there are too many requests, just like UDP dropping packets.

Edit: I don't believe a lock can be avoided in the Sonic Pi server when handling concurrent incoming commands, whether it's written in Erlang or not. Sonic Pi, I believe, has a single audio device, which makes it a shared resource. Concurrent access to a shared resource has to be managed with a lock somewhere along the call path. In that case a single-threaded TCP server is perfectly fine, serving as the lock as well.


Shouldn't you handle fragmentation? I don't think TCP guarantees that csock.recv() will give you all of the message.


A fragmented IP datagram is re-assembled at the IP layer before it is handed up to the UDP or TCP layer. If it can't be re-assembled, the datagram is considered lost.

UDP is unreliable and has a small packet size; the TCP code is emulating that simple requirement. Why expand the requirement? Looping to read fully would block on one connection, and one rogue client would hold up the whole server.


I looked it up, and it seems that while IP will reassemble packets, there is no 1-1 mapping between send calls and IP packets, when using TCP.

You are only guaranteed to get the bytes in the correct order but you have to find the boundaries between messages yourself.
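
In Erlang the usual fix is a length prefix: either pattern-match it out of an accumulated buffer yourself, or let the socket do it -- with the {packet, 4} option, gen_tcp transparently adds and strips a 4-byte length header, so each recv returns exactly one application-level message. A rough sketch (Port is whatever you listen on):

    {ok, Listen} = gen_tcp:listen(Port, [binary, {packet, 4}, {active, false}]),
    {ok, Sock}   = gen_tcp:accept(Listen),
    {ok, Msg}    = gen_tcp:recv(Sock, 0).  % Msg is one whole framed message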


That's correct. A send call could be made with a 1GB buffer and the IP layer would have to break it up into multiple datagram packets. That's the nature of TCP. Again, sending large messages is expanding the requirement beyond what UDP is capable of, while we were striving to emulate UDP in sending short, unreliable messages.


> I don't understand his emphasis on session on TCP and UDP. A session requires nothing more than a "unique" number to identify it.

... which requires a three-packet handshake (at least a full round trip) before the client and server agree on it.

UDP gives you the flexibility of sending payload starting from packet #1. There may be session identifiers in that packet, too.


He didn't talk about performance at all. If he said TCP is slower than UDP, fine, that's a perfectly valid reason to use UDP.

But he was saying that TCP imposes the notion of a session on the application while UDP doesn't, which is false. If he meant the TCP connection as a session, why did he lead the discussion to managing sessions with locks and threads in the application? Then he talked of the ease of Erlang handling sessions while other languages have a hard time, which was an exaggeration. Session management is a solved problem, in many languages, by many people.


Can anyone explain what a sequential programming language is?

Context here:

"I guess I’d underestimated the difficulty of implementing Anything-over-TCP in a sequential language. Just because it’s really really easy in Erlang doesn’t mean to say it’s easy in sequential languages."


I think he's referring to how Erlang uses the Actor Model. An Erlang program consists of (possibly) thousands of individual lightweight processes. In this model a typical task may be spread out over many asynchronous processes. You can think of it as a program that's made up of thousands of micro-services. Instead of a task taking place over a sequential piece of code, it takes place over many individual processes that coordinate by communicating with each other.


Imperative programming. Lists of statements, vs Erlang's declarative programming, where you describe the program without specifying control flow.


Erlang is absolutely a fixed in-order control flow programming language, on message receipt. It is (generally) referentially transparent, but not declarative.


Oh, I don't know much about Erlang so I thought for sure that's what he meant. What was that supposed to mean, then?


Given that he is contrasting to Erlang, maybe he's talking about imperative languages instead of (pure) functional ones?


Erlang's pattern matching allows a great many things to be done in a declarative style. For some problems you end up with a lot of pattern stuff and not all that much "normal" code. Or you pattern match through what would ordinarily be a lot of flow control statements, for each case writing very simple code to handle that one thing.

Since Erlang isn't a declarative language in the larger sense, that's my guess what he meant.


My 5¢ is that the 9P2000 protocol would be a much better and more canonical example.

Also, I hope some day someone will show how to implement most of the functionality of Hadoop orders of magnitude more efficiently with a bunch of Plan 9 shell scripts. Or in Erlang with a 9P protocol driver.


Is there any similarity between "tag-length-value" and netstrings (1997)?


Woah, funny I see this. I've been working on a distributed shell, basically trying to combine sockets and pipes to create remote pipes, aka inter-host piping.

Something like:

  $ echo foo | host2:grep bar | host3:grep baz


> basically trying to combine sockets and pipes to create remote pipes

Sockets, or ssh? To get a remote shell, you're going to want ssh.


From the title I thought that the ultimate monad tutorial had finally arrived.


The problem with flat messages is that optional composite types can be partially present.


gRPC seems like a good alternative.


Title probably should be edited to remove "Joe Armstrong: ". He might be badass, but he isn't a way to connect programs together.


I read the article because it said Joe Armstrong. That way I knew it was someone who knows what he's talking about. Especially given the topic.


Just curious, wasn't the hostname a clue?


Same here. If I had not seen "Joe Armstrong" in the title I wouldn't have clicked on it.


It was a play on words with the original title, reading ':' as 'is'

> Joe Armstrong: A Badass Way to Connect Programs Together


He isn't a way to connect programs together, but he -is- a generator for connecting programs together. You just have to show him two programs and a reason to connect them, wait a bit, and there'll be a github repo. I hope I'm that inquisitive and playfully experimental when I reach his age.


How sure are you about that?


| ?



