Hacker News
Why we wrote our Kafka Client in Pony (wallaroolabs.com)
137 points by hackmanytrades on Jan 30, 2018 | 95 comments


Man, I really hope Pony grows a decent library ecosystem fast! The language itself is sooo nice, the type system is pure bliss (at least this side of Haskell), and I even managed to get beyond the first peak of "I have no clue what is going on here" to using the type system sensibly.

I don't care that much about scalability (not that I'm complaining, either), but the language and especially the type system were an eye-opener for me. Having a guarantee - a proof! - that entire classes of bugs that haunt software written in C, C++, and Java will be impossible to smuggle past the compiler sounds like a realistic approximation of the four-dimensional compiler I once envisioned, one that would retroactively turn any and all runtime errors into compile-time errors (without creating a paradox, of course!).


4D compiler? That sounds amazing. I imagine that could be extended with a Kafka stream of errors from staging and production. I guess that would require a ‘system aware’ compiler.


What was the reason for not contributing a "reactive" API to the existing C client?

Especially because now you have a maintenance burden on your own client which, ironically, performs considerably slower than the C client, even though that was the major concern for not using it in the first place!


That's a great question.

Yes, Pony Kafka is currently slower than the C client. But it is also almost completely untuned as of right now. We expect there is a lot of low hanging fruit on that front that will give us significant gains.

There is also the secondary concern regarding the thread pools internal to Pony and librdkafka. We've seen first hand how CPU cache invalidation can impact performance so we are very aware of the potential negatives if the Pony and librdkafka threads ever end up fighting with each other over the same CPU resources.


Sorry, that was my question-- was there a way to avoid using librdkafka's threadpool and essentially use it as a 'dumb' client, moving all the async stuff to your Pony actor layer?


From what I understand of librdkafka, there's no easy way to disable the internal thread pool it uses.

I'd imagine that the internal thread pool for sending/receiving data from Kafka is as core to librdkafka as the internal thread pool for running actors is to Pony and trying to remove or disable either of them would be a large undertaking.


The existing API already offers the ability to integrate with any event loop. See rd_kafka_queue_io_event_enable(), which lets you provide a file descriptor to be notified of incoming data.


They made a decision some time ago that they were a Pony shop. Adding something written in a different language would from their perspective (perhaps) be the additional maintenance burden.


> Pony Kafka sends data to Kafka about 5% - 10% slower than librdkafka but reads data from Kafka about 75% slower than librdkafka.

What? So what's the point? Wouldn't it be better to just contribute to and optimize the existing clients, then?

The Kafka/Confluent team specifically chose to implement everything in librdkafka (because Kafka is heavy on client-side logic) and then make thin wrappers for every language, so performance, bugs, and stability can all be worked on in a single place.


That was for what I'd imagine is v0.1-quality code; they said there are plenty of optimizations available to bring it at least to parity with the C client. Secondly, the issue was the system-level performance between Wallaroo and the Kafka client, not necessarily the raw performance of the library.


Yes, that's correct on both counts.

Pony Kafka is almost completely untuned as of right now. We expect there is a lot of low hanging fruit on that front that will give us significant gains.

And yes, we're concerned about the potential thread pool contention between Pony and librdkafka.


Copying data in memory between libraries is slower than actually getting the data across the network from the Kafka cluster itself? Seems like a strange bottleneck if so.


If they worked on the C client, they then have to work in C, which is bad. (This is one of the things covered in their Tradeoffs section, along with the problems with serialization and the C client's thread pool).


Pony is such a cool project; it's what Go could have been if it weren't stuck in the middle of the eighties: it has a super expressive type system and, a bit like Rust, it gives you data-race freedom.

I really hope it gains traction.


(This has nothing to do with Wallaroo or Kafka, but:) Go isn't stuck in the 80s; it's just stuck on Rob Pike's belief that him personally being lazy, and jury-rigging in general, is "simple" and a virtue.


I don't think that making criticisms personal like that accomplishes anything useful. The fact that a large number of other people agree with his decisions suggests that it's not "being lazy" and more a balancing of different goals. It would be far more productive to understand those pressures than to simply assume the worst.


Any criticism of Pike himself, or of any project he has been attached to over the last 40 years, has accomplished nothing, no matter whether it was technical or personal.


Hi, I'm the author of the blog post and the primary author of Pony Kafka. I'll be around to answer questions.


"Why we wrote our Kafka Client in Pony (wallaroolabs.com)"

I love the fact that the words "Kafka," "Pony," and "Wallaroo" are all together in one sentence, and not as part of an elaborate joke. I love even more the fact that Serious Business Executives will have to use these words to discuss useful technology. Awesome names.


Just one of the many perks of working here at Wallaroo Labs. 8*P


> I love the fact that the words "Kafka," "Pony," and "Wallaroo" are all together in one sentence, and not as part of an elaborate joke

Why are these two things mutually exclusive?


First - I spent a good amount of time with Pony attempting to make something similar to Wallaroo (on a much smaller scale), and it seems I hit the same problems as you re:FFI. Does it worry you that creating idiomatic C wrappers in Pony feels 'wrong'? Are there ways to make it better? After all, young languages especially are very dependent on wrapping C libs for basic functionality.

Second - from what I can tell, Wallaroo Labs are now one of the main contributors to Ponylang. How are your experiences from writing Wallaroo affecting language design and direction?


Great questions!

Re: FFI. In general, using the C FFI and creating C wrappers is not a major issue (aside from them possibly not being idiomatic) and, as you mentioned, is required for a language as young as Pony. In fact, Pony Kafka internally relies on a number of C libraries via FFI.

However, for the Pony Kafka use case, performance was a key driver. That meant the risk of contention between the librdkafka thread pool and the Pony thread pool had to be taken into account for our long-term performance goals. The same goes for the polling nature of getting data out of librdkafka. Neither would have been a major issue had performance not been a high priority for Wallaroo.

As for idiomatic C wrappers, I think it's possible to wrap C libraries in an idiomatic way. It's not easy, though, because it's hard to figure out the right abstractions in Pony and map them to the C functionality. I don't think that's a problem, because not everything has to be idiomatic from day one. Over time, the C wrappers can be made more idiomatic and/or phased out via native implementations.

Re: Ponylang direction. My personal experience is that the Pony community has been very receptive to our ideas. I've mainly focused on the Pony runtime side of things and less on the compiler/language design side, though.


Kafka is written in Scala, though it exposes a Java interface for ease of use.


Thanks! Blog post corrected.


Ehm, pick a niche programming language, build your own client for a complex system. All cool, but it's hardly the straightest line to solving a company's problems in the simplest and most effective way. A good thing if you have some spare R&D time, maybe, but hardly a sane approach for most.


The investors must be thrilled.


As someone who has worked on building a custom consumer API for Kafka (in the distant past, before the 0.9 Java consumer came out), I am curious about how much engineering effort this required.

I can see from github that the project is about 16k lines of code. I wonder how many developers worked on it, how much of each of their time it required, how many false starts in the architecture of the library had to be abandoned...


99% of Pony Kafka is written by me (for better or worse). I've been working on it since about May 2017 off and on with it being my primary focus for the majority of that time. However, due to working arrangements and other commitments, I've only spent about 12 or so weeks of time on it (where 1 week is equal to 5 days and 1 day is equal to 8 hours).

There have been a few iterations on the abstractions and API of the library but the majority of the architecture has been the same from the initial design sketch. I started by envisioning the features the library needed for end users and also internally in order to fully take advantage of Pony's actor concurrency model. From there I worked out the data and functionality ownership of the various bits (i.e. which actor does what and why). Lastly, I ran it by a couple of folks here at Wallaroo Labs to make sure I wasn't making any obvious mistakes.

The biggest change so far has been caused by building in the leader failover handling in relation to the data/responsibility ownership transferring from one actor to another. That's not entirely completed yet but it has mostly been an internal change. The end user API has also changed, but that has mostly been about fixing abstractions and/or data ownership issues.

I'm sure there will be additional changes as I have time to go through to fix abstractions and add in other features. Dynamic configuration changes, exactly once semantics, and group consumer functionality are all likely to impact the end user API along with requiring internal changes.


Out of curiosity, what pushed your team towards Pony rather than Erlang? It seems your team (or at least part of it) has experience with both languages which is interesting.


Short answer is in another thread here.

After discussing our performance goals with folks we knew at Basho, they expressed a lot of skepticism that we could meet them with Erlang.

Plugging in potentially heavy user computations written in non-BEAM languages is an incredibly tricky problem as well.

If you'd be interested in chatting more, I'm happy to do that. See https://news.ycombinator.com/item?id=16266220 for more information.


Dipon, why not rewrite Kafka itself in Pony? The biggest problem is the synchronous API, and as a Pony service it would be much better, driven by async actors without polling. Scala is very similar to Pony.


I'm not overly familiar with Pony, but I'm curious, and the code looks nice and clean. One oddity though; so many of the identifiers have "Kafka" in them. Does Pony not have module namespacing?


It does.

But the point you raise...

"Should this be HTTPLogger or Logger given that it is in the HTTP package"

and variations thereof is something that has been a point of contention at almost every job I've been at.

By default in Pony, if you use a package, its classes are imported directly into your namespace, so...

HTTPLogger is more clear in that case, but you could use a qualified import and then have something like http.Logger.

It's a matter of preference.


I understand, but the sheer amount of duplication is rather overwhelming. Also, a lot of it seems like implementation details related to the API/protocol and so on that don't need this kind of naming uniqueness.

Go solves this by never dumping namespaces into another namespace: You have http.Request, and that's it, which is both unambiguous and self-explanatory. Name clashes can occur (e.g. packages have the same name, or a local variable has the same name), but that's rare.


I really like "why we wrote X in Y" sorts of blog posts. The answer is almost always one of: because we could, or because it works for our use case.


It's actually almost always a post-hoc justification for scratching an itch.


Exactly. Pony looks interesting, though. I am wondering how it compares to Erlang. It seems like they are trying to address the same problem (an actor model with memory safety) with different approaches.


TL;DR

Because it's more fun to learn a new programming language than to deliver features to users. We're VC-financed, so no need to make money; also, spend it before people start talking about profitability, because then the good times are over (see Etsy). We also get to add Pony to our CVs and move on in a year to the next company, where we will introduce the next big thing to add to our CVs. Plus 10% more salary! Kaching! #LivingTheLife


It might be a bit early to accuse Wallaroo of pouring money into "fun for coders" and ignoring the market's actual needs.

When I read their blog, I get the feeling that they are thinking things through. They seem to target a specific market and focus on what can make their product desirable. And yes, they are definitely experimenting with a few things, but they are well aware of the trade-off: it appears to be a calculated risk, and it might very well pay off in the near future.

[EDIT]: typo


Given what I know of Pony, WhatsApp and Erlang, it reminds me of WhatsApp's decision to write their backend in Erlang.

And it is a risk - big whoop, you've written production code in Pony. Good luck doing that again elsewhere, or hiring someone competent in it. I don't think it's quite hip or popular enough to be very useful on your resume (but I could be wrong).


> Good luck doing that again elsewhere

If we only keep doing stuff because it's been done elsewhere, we stop innovating.

> it reminds me of WhatsApp's decision to write their backend in Erlang.

WhatsApp's decision to use Erlang was based on totally valid points. In fact, Erlang's concurrency model made scaling so easy for them that they only needed about 50 engineers overall to handle 50B messages a day. [0]

You might not know Pony, that doesn't mean it couldn't make sense for someone else.

[0]: http://highscalability.com/blog/2014/2/26/the-whatsapp-archi...


That's not the impression I got from the article, which cites performance, API architecture and integration with existing codebases.

Is the "dr" in "tl;dr" literal?


Your comment and the parent aren't mutually exclusive. It's fairly easy for coders to provide vast amounts of technical justification for resume driven development.


Is a new, immature language like Pony really that desirable in the marketplace? I highly doubt "resume development" was ever a primary motivation.

If it was anything beyond the strategic interests of the company, it's that developers love a) clean code and the possibilities of a fresh new project, and b) playing with shiny new toys.


Hi, VP of Engineering at Wallaroo Labs here.

We put a lot of thought and experimentation into our decision to use Pony. It wasn't a decision we made lightly.

I wrote a post a few months back that covers that decision:

https://blog.wallaroolabs.com/2017/10/why-we-used-pony-to-wr...


The characterisation of JVM GC behavior seems slightly unfair.

Saying that JVMs are 'stop the world' and Pony is 'concurrent' feels like it's ignoring modern JVM GC strategies.

It might be true that you can avoid stop the world collection for an actor-system, but by logical extension would that not be possible on the JVM as well for that particular workload, given a suitably designed actor-system?


It's possible, yes. At the time we started working on Wallaroo, the only real option for concurrent GC on the JVM was Azul, and we didn't want to tie our product and our goals to another company's commercial offering.

My intent was not to be unfair. There's a lot of nuance in the topic that can be hard to cover in a more general blog post. Garbage collection is a fascinating subject, and a great amount of detail was left out of that post. I was going for a broad overview of the general thinking.

[edited for typo]


I think this statement is a reach. There are plenty of concurrent collectors for the JVM, and there are tons of architectural and coding strategies to mitigate inconvenient GCs. This is not a one-of-a-kind problem; it is well studied and the path is well trodden. Azul is a convenient solution that wouldn't involve a whole lot of case study, but it's certainly not the only one. And this is as if perfectly deterministic latency were the only factor that mattered (then why not use C?).

What it sounds like instead was very minimal benchmarks or science was performed ahead of time, then a lot of justification written afterward. I know that's a reach as well, but this is a well-trodden path, so the answer "Pony is the best possible solution for the interest of our business" just seems like a very strange conclusion.


> At the time we started working on Wallaroo, the only real option for concurrent GC on the JVM was Azul.

Curious if you did any benchmarks of Pony against the standard JVM Concurrent Mark Sweep GC, which aims to reduce GC-induced pauses?


It's not really an apples-to-apples comparison, but before we went with Pony we did try out a number of things on the JVM.


I see the argument/utility in using the right tool for the job, even if it's a somewhat new language, more than most people. But then I also have a higher tolerance for risk than most (and I have experienced the downside costs of those choices; they are very real).

Choosing newer languages gets maligned far more often than pigeonholing the wrong old languages onto problems does.

Besides, someone's got to take risks with (potentially) better technology; as long as they know the risks and have fully considered them going in, then by all means. Plus, the longer we continue to use C for all systems development, the longer we'll have preventable security issues.


I've read that blog post and some of your other posts on HN. I get why the JVM, C/C++, and Go were not fits. However, I have not seen a lucid explanation of why you didn't go with Erlang or Elixir.


That's a complicated topic. The short version is:

We looked at Erlang. Several of us are friends with folks who worked at Basho on Riak and we talked with them about our performance goals. They were very skeptical that we could meet them using Erlang. Based on that, we moved on from Erlang.


It's worth posting the long version. I've been following the various Pony blog posts with interest (I'm both a language geek and a distributed systems geek), but I always come away with the notion "Huh, kinda cool, but why didn't they just use Erlang? It'd be a great fit for this."

So either:

a) Erlang is not a good fit, and I'm wrong. Then I'd really like to know why I'm wrong!

b) Your friends at Basho led you astray. Would also be interesting to know what happened in this case!

Either way, without knowing more details, the short version you just posted is inconsistent with the claim that you guys did serious research into existing language ecosystems before going your own way.


Erlang, while having many virtues, is simply slow. Once, I reimplemented in Elixir a toy data science tool I had previously built in node. Idiomatic node, idiomatic Elixir, both written for readability. The Elixir was approximately 100 times slower than the node version.

Now Erlang often feels fast, because of the architectures it allows, but when you get down to shuffling bytes around or doing low level math it is currently slow, slow, slow.

Given Wallaroo's speed goals, I would have been really surprised had they used Erlang:

http://benchmarksgame.alioth.debian.org/u64q/nbody.html


I mean, really, you shouldn't be using BEAM languages for scientific and computational tasks in the first place. For starters, there isn't a native array type (everything is lists). That's fine, because lists give you flexibility while preserving immutability guarantees when passing data across functions.

If you're doing an n-body simulation, then this is a good benchmark for deciding whether or not to use Erlang/Elixir. But if you're building a server that mostly parses JSON over HTTP and spits out more JSON over HTTP, and needs to handle thousands or millions of parallel connections without hiccuping, is an n-body simulation the right benchmark reference?


It's well known that Erlang is not suited for data science and number crunching. I'm curious why you even bothered; nearly every introductory guide I've read makes this clear.

So that's not a good example of Erlang/Elixir's performance, which is hardly known to be "slow". The language and its process/actor model are faster than many other languages, particularly in the web space.

The authors also mentioned a heavy dependency on the actor model as a performance-optimizing strategy and for optimal code structure, which is why it's likely worth fully exploring for the OP's problem.


I too would love to hear a long-form answer to this. Erlang seems like a good fit for this problem, besides maybe packaging up the client in an easily usable fashion.


"this problem" is very broad and there are aspects of it that Erlang is indeed a very good fit for. There are aspects where it is less so.

I think this is a rather in-depth conversation where HN comments aren't the most productive mechanism. If either or both of you are interested in chatting more on this, my email is sean@wallaroolabs.com. Drop me an email and we can arrange a time to chat.


Have you ever made a decision to not use Pony (or whatever the new sexy tech)?

I've read a few blog posts now and the end result always seems to be Pony.

You even talk up how great the C and Java client libraries are. Well, they can't be as great as you say, or you would have used them.

The C library seems to perform better, be more featureful, and be better tested. So once again, why? It certainly can't be because your library is going to top the C library in any way. The article even seems to imply you'd be happy being at parity with the C lib.


In regards to the C and Scala/Java client libraries. They are great for what they do and how they do it. However, that doesn't mean they're ideal in every scenario. For example, the Scala/Java client is the most feature rich client and is actively developed in sync with the Kafka brokers. This, however, doesn't make it suitable for embedding in other languages. As a result, the C client was created by the community and is now officially supported by Confluent. That doesn't in any way take away from the quality of the Scala/Java client though.

Also, while the C client is more featureful and better tested, there is still the concern regarding the thread pools internal to Pony and librdkafka. We've seen first hand how CPU cache invalidation can impact performance so we are very aware of the potential negatives if the Pony and librdkafka threads ever end up fighting with each other over the same CPU resources and would prefer to avoid that.

Yes, Pony Kafka is currently slower than the C client. But it is also almost completely untuned as of right now. We expect there is a lot of low hanging fruit on that front that will give us significant gains. Yes, we mention in the blog post that we would be happy at being parity with the C client but our goal has always been to exceed it, eventually. Both in terms of performance and features.


I'm coming at this as somebody who often has to rewrite a lot of library code because of performance issues and poor decisions by library writers, often regarding things like garbage collection and hidden resources, like thread pools or an event loop, that cannot be hooked into. I see it all the time. I can no longer count the number of times I've had to rewrite parts of the JDK or networking libraries because of these issues.

Now, this is what I'm hearing from what you are saying:

> 1- We can't use a JVM implementation because we aren't using a JVM language.

Makes sense.

> 2- The C library is okay, but hides its thread pool with no way to access it.

Ugh. Hate that. It's like the people writing these have never had to use them in a real project. The sign of a mediocre library.

Pony's actor model might require rewriting almost any library it uses when concurrency is involved.

But I think you answered your own question in the title:

> Why we wrote our Kafka Client in Pony

1- Because the C library is mediocre and hides its threads from users, making it not very useful for high-performance applications.

2- Because the rest of the system is in Pony. Really, you could write it in C/C++ or even Rust as long as you wrote it in a way that played well with Pony's concurrency model, but why bother with that extra effort, especially if you believe - as you seem to - that Pony's concurrency story is superior.


Once we made the decision to use Pony for Wallaroo, that has driven a lot of our other choices. The Java and C client libraries are excellent. We had architectural concerns about how the thread pool in the clients would interact with our scheduler threads.

We get a large performance improvement from having a single scheduler thread for each CPU. Adding another thread pool that competes for CPU usage would be problematic.

Our client is for those high-performance use cases where, if we can get parity or close to parity with the C client, we should get much better overall performance because of those architectural concerns.

That said, we plan on providing a way for folks who are less concerned with performance to use the C client library.

In the end, it was less about "use Pony" and more about "do this in a way that matches with Wallaroo's architecture".


> In the end, it was less about "use Pony" and more about "do this in a way that matches with Wallaroo's architecture".

This could be accomplished without using Pony, unless you view "Wallaroo's architecture" as being effectively synonymous with "use Pony."


> This could be accomplished without using Pony

Sure, but why should it be accomplished without Pony? Languages are optimized for use-cases. This means that some languages are good and some are worse at handling particular use-cases. If Pony is the best choice for their use-case, why would you not choose it? Taking all the risks of a new tech into account, of course.


If no one takes the plunge, how do languages and technologies ever get proven? A startup without bureaucracy, institutional legacy, and technical debt seems like a good place to do it.


At least it is more mature than Rust. And faster and safer, also.


Ha! You might as well say "Beetlejuice" 3 times while looking into a mirror.


Minor quibble: no mirror needed to summon Beetlejuice, only Candyman and/or Bloody Mary. ;)


This is why I keep quiet when pub trivia turns to popular culture.


I think there is a hipster element to language selection in situations like the one the parent refers to. I've certainly seen this at play.


I don't know about "hipster" unless you're using it as a synecdoche for "trend following."

There is definitely a predilection in certain parts of the coder community to prefer newness and difference over tried and true. There isn't anything wrong with that necessarily: it's part of how progress is made. However I think it's often taken to extremes in the coder community.


You jealous?

Also, picking the right tool for the job can be a real advantage, making it easier to deliver features.

Getting so close to the C implementation (in terms of speed) with Pony is actually insane if you look at the number of guarantees Pony gives you. Next time you dereference a NULL pointer, remember that it's impossible in Pony. Oh, and next time you spend a week debugging some hairy locking issue, consider that the issue wouldn't happen in Pony at all. EDIT3: removed EDIT1 from here.

Currently, Pony is in direct competition with Go (but with a different concurrency model) and Erlang/Elixir (but natively compiled). People and companies frequently choose Go or Erlang, so I don't really understand why they shouldn't choose Pony if it fits their use case.

EDIT2: And here I am getting downvoted... I wonder, is anything I wrote not true?


>so I don't really understand why they shouldn't choose Pony if their use-case fits.

The difference between Pony and Go or Erlang (or even Elixir) is that the Pony team still are making breaking changes to the language. That means that your dev team may need to spend time to update features due to breaking changes in the language. Also, the ecosystem isn't there like it is for Go or Erlang.


Yeah, but both Go and Elixir (Erlang less so - commercial and internal PLs work a bit differently) were in the same situation at some point: very small ecosystem, small community, lots of changes to the language. Adopting a language at this stage of evolution has a set of very well-known risks, but it has to be done by someone for the language to ever reach maturity. Trying to use it seriously is one of the best ways to contribute to the language.

In any case, if you are aware of the risks and plan to mitigate them - by, for example, employing people capable of debugging and fixing the language's implementation - you're left with some risk and a lot of advantage (if you're lucky and your domain is indeed the one your language is best suited for). It's a gamble, of course, but then nearly every decision (other than buying IBM) is one.


Those are excellent points that we considered when we went with Pony. So far, we feel it has worked out well. A large number of those breaking changes have originated with us at Wallaroo Labs so they've been pretty easy for us to stay on top of.


Actor style concurrency exists for many other language platforms. See Akka for JVM/Scala or Seastar for C++.

I'm somewhat skeptical of GC pauses being a problem in anything that's not actual hard real time, like avionics or manufacturing equipment. What difference will a few-hundred-millisecond pause even make in a distributed async data pipeline? And that's on the higher end of pauses these days.


There's an excellent paper on GC pauses and its impact on Spark and Hadoop. "Trash Day" => https://www.usenix.org/node/189882


Sure, I understand pauses happen and there is a performance degradation, I'm asking whether it really matters in a processing framework that isn't controlling medical equipment or airline hydraulics. Is something going to break if there's a small pause? Especially in return for the productivity and safety of using managed runtimes?


It's the aggressive tone, larded with cherry-picked negative aspects of other languages and irrelevant details, I suspect.


> cherry-picked negative aspects of using other languages and irrelevant details

I don't understand? Why are they irrelevant? They're basically Pony's reason for existing... Cheap, efficient and safe concurrency, coupled with very high level of type-safety, is the main selling point on Pony. It is much better on these counts than many other languages. Maybe I shouldn't have mentioned Python and Ruby, I'll edit the post.

> It's the aggressive tone

I see. Compared with the charming politeness of the OP comment, I must have sounded really rude. I apologize.


I don't know if a 5-10% difference in write speed and a 75% difference in read speed count as "so close to the C implementation." These are both operations that will be happening thousands/millions/+ times per day, so it feels incorrect to say they're close.

These read/write differences will compound to make the rest of the data processing pipeline slower.


They already commented on that performance:

> Yes, Pony Kafka is currently slower than the C client. But it is also almost completely untuned as of right now. We expect there is a lot of low hanging fruit on that front that will give us significant gains.

>There is also the secondary concern regarding the thread pools internal to Pony and librdkafka. We've seen first hand how CPU cache invalidation can impact performance so we are very aware of the potential negatives if the Pony and librdkafka threads ever end up fighting with each other over the same CPU resources.


For the first, proof-of-concept, no-optimizations-applied version of a low-level library, written from scratch in a high-level language, to be anywhere near a production-ready, presumably optimized C implementation is actually very impressive. I'd expect the new implementation to be an order of magnitude (or more) slower than the C one initially, and to only get better through many rounds of optimization in the following months.


Jealous of having to rely on a Kafka client with not much testing or performance work, written from scratch on top of an experimental language?


Obviously, I was referring to this part of GP post:

> add Pony to our CV and move on in one year to the next company where we will introduce the next big thing to add to our CVs. Plus 10% more salary!

It just sounds like venting to me, without any rational or fact-based argument for why using Pony is bad in this specific case. Doesn't it?


So wait, engineering has no value, progress should only come from research departments at big corporations, we should never engage in anything outside of product and ad work, and anyone who applies research is stealing time from their employers.

Maybe, just maybe, we're starting to demand more of career software engineers. Maybe, just maybe, it's time for folks to stop grousing about how value-driven they are and realize that bad engineering is more expensive than good engineering over even medium term timescales.


To me, Pony looks like a mix of Kotlin/Ceylon and Rust that at the same time also happens to pick up all the low-hanging fruit of programming language features that most "modern" languages haven't even attempted to pick up yet.

Rust has some flaws. Kotlin has some flaws. Pony appears to have even fewer flaws than either.

Of course I have merely read the tutorial. In practice it could be worse than either.


I often describe Pony and Rust as fellow travelers. They're following roughly the same path, but they have different constraints, and so they've made different choices that are suitable for what they need.


Yep, we don't want to talk about this...and maybe it's unfair this particular example is getting focused on...but yes, it does have many of the hallmarks of the software industry's dirty little secret: we like shiny new stuff, and we've gotten really good at convincing ourselves and others that novelty is not holding an undue influence in our decisions.


I think it starts to get interesting when you substitute "program" for "programming": because it's more fun to write a new program than deliver features to users with an old program. Now fun looks more suspicious.

You know HN is written in Arc, right? I'll just jump to the end: they did it for the features, and having fun just happens.


Yup


^ Sometimes the truth is a little brutal


And sometimes the truth is that the author really thinks that way; they're probably a defensive, mediocre person in software.



