The new Zig I/O idea seems pretty ingenious, if you write mostly applications and don't need stackless coroutines. I suspect that writing libraries in this style will be quite error-prone, because library authors won't know whether the provided I/O is single- or multi-threaded, whether it uses evented I/O or not... Writing concurrent/async/parallel/whatever code is difficult enough on its own even when you have perfect knowledge of the I/O stack you're using. Here the library author will be at the mercy of the IO implementation provided from the outside. And since it looks like the IO interface will be a proper kitchen sink, essentially an implementation of a "small OS", it might be very hard to test all the potential interactions and combinations of behavior. I'm not sure if the few async primitives offered by the interface will be enough to deal with all the funny edge cases you can encounter in practice. To support a wide range of IO implementations, I think the code would have to be quite defensive and essentially assume that the most parallel/concurrent version of IO will be used.
It will IMO also be quite difficult to combine stackless coroutines with this approach, especially if you want to avoid needless spawning of coroutines, because the offered primitives don't seem to allow expressing explicit polling of the coroutines. And even if they did, most people probably wouldn't bother to write code like that, as it would essentially boil down to code that looks like "normal" async/await code, not like Go with implicit yield points. Combined with the dynamic dispatch, it seems like Zig is going a bit higher-level with its language design. Might be a good fit in the end.
It's quite courageous calling this approach "without any compromise" when it has not been tried in the wild yet - you can claim this maybe after 1-2 years of usage in a wider ecosystem. Time will tell :)
> It will IMO also be quite difficult to combine stackless coroutines with this approach
Maybe there will be unforeseen problems, but they have promised to provide stackless coroutines, since they're needed for the WASM target, which they're committed to supporting.
> Combined with the dynamic dispatch
Dynamic dispatch will only be used if your program employs more than one IO implementation. For the common case where you're only using a single implementation for your IO, dynamic dispatch will be replaced with direct calls.
> It's quite courageous calling this approach "without any compromise" when it has not been tried in the wild yet.
You're right. Although it seems quite close to what "Jai" is purportedly having success with (granted, with an implicit IO context rather than an explicitly passed one), it's arguable whether you can count that as being in the wild either...
> I think that the code would have to be quite defensive and essentially assume the most parallel/concurrent version of IO to be used.
Exactly, but why would anyone think differently when the goal is to support both synchronous and async execution?
However, if asynchrony is done well at the lower levels of the IO event handling, it should be simple to implement correctly by following these principles everywhere. The "worst" that could happen is that your code runs sequentially (and thus slower), but you won't run into races or deadlocks.
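To make that concrete, here's a toy sketch in Rust terms; the `Io` trait, `BlockingIo`, and `ThreadedIo` below are hypothetical illustrations of the idea, not Zig's actual interface. The "library" code only relies on a channel for synchronization, so it is correct whether `spawn` runs the closure inline (the sequential, slower-but-safe case) or on a real thread.

```rust
use std::sync::mpsc;
use std::thread;

trait Io {
    // The caller may not assume whether `f` has finished when `spawn` returns.
    fn spawn(&self, f: Box<dyn FnOnce() + Send>);
}

struct BlockingIo;
impl Io for BlockingIo {
    fn spawn(&self, f: Box<dyn FnOnce() + Send>) {
        f(); // degenerate case: everything runs sequentially
    }
}

struct ThreadedIo;
impl Io for ThreadedIo {
    fn spawn(&self, f: Box<dyn FnOnce() + Send>) {
        thread::spawn(f); // genuinely concurrent
    }
}

// "Library" code written defensively: no shared mutable state, no assumptions
// about ordering between the two tasks, only explicit synchronization.
fn sum_both(io: &dyn Io) -> u32 {
    let (tx, rx) = mpsc::channel();
    let tx2 = tx.clone();
    io.spawn(Box::new(move || tx.send(compute_a()).unwrap()));
    io.spawn(Box::new(move || tx2.send(compute_b()).unwrap()));
    rx.recv().unwrap() + rx.recv().unwrap()
}

fn compute_a() -> u32 { 40 }
fn compute_b() -> u32 { 2 }

fn main() {
    assert_eq!(sum_both(&BlockingIo), 42);
    assert_eq!(sum_both(&ThreadedIo), 42);
}
```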
That's my point: I don't see how there could be people dedicated to working on an issue as grand as this in Rust's current organizational form. Especially considering all the gotchas, and the continuous development of 'more fun' things (why work on open source if it's no fun?). That's why it's 'the bedrock'.
To do something like that, Rust would need to be forked and later on rewritten with optimizations. But by then it wouldn't be "Rust" anymore, it would be a sibling language with rusty syntax. Rust++, perhaps.
> I don't see how there could be people dedicated to work on an issue as grand as this in Rust's current organizational form
You're getting halfway to giving actionable feedback here: what exactly is the problem with the current organizational structure that would prevent tackling "grand" issues like these? Is there a specific point in time when you felt like Rust stopped being able to work on these grand issues, or has it always been like this according to you?
> why work on open source if it's no fun
It's always fun for someone out there, I'm sure :) There are loads of thankless tasks that seemingly get done even without having a sexy name like "AI for X". With a language as large as Rust, I'm sure there might even be two whole people who are salivating at the idea of speeding up the current compiler.
> what exactly is the problem with the current organizational structure that would prevent tackling "grand" issues like these?
Well, it's summarized quite well here:
>"Performing large cross-cutting changes is also tricky because it will necessarily conflict with a lot of other changes being done to the compiler in the meantime. You can try to perform the modifications outside the main compiler tree, but that is almost doomed to fail, given how fast the compiler changes8. Alternatively, you try to land the changes in a single massive PR, which might lead to endless rebases (and might make it harder to find a reviewer). Or, you will need to do the migration incrementally, which might require maintaining two separate implementations of the same thing for a long time, which can be exhausting." - OP
A rigid organizational form (such as a company) can say: "Okay, we'll make an investment here and feature freeze until the refactors are done". I have a hard time seeing how the open source Rust community, who are doing this out of passion, would get on board with such a journey. But maybe, who knows! The 'compilation people' would need to not only refactor to speed up compilation times, they'd also need to encourage and/or perform refactors on all features being developed 'on main'. That, to me, sounds tedious and boring. Sort of like a job. Maybe something for the Rust Foundation to figure out.
I read that quoted part as "it's tricky" rather than "it's impossible because no one wants to do it", just like with many collaboration efforts in FOSS. But you're right that it's probably for the foundation to figure out; a lone compiler-optimization geek isn't gonna be able to come up with a project-wide solution and force it through.
Haven't the Rust team already implemented "grand features" that took many years to get across the finish line? For example, GATs didn't look particularly fun, exciting or sexy, but somehow, after being thought about and developed for some 5-6 years, they eventually landed in stable.
Edit: Also just remembered a lot of the work that "The Rust Async Working Group" has done, a lot of which required large collaborations between multiple groups within Rust. Seems to have worked out in the end too.
This is a concern for any fast-moving project, i.e. it's a good problem to have! You can work on your modifications on a side branch and then forward port them to the current state of main before proposing them for merge, it will probably be less work overall.
I'm not sure how that works. You either let the compiler compile your whole program with AVX (which duplicates the binary), or you manually use AVX with runtime detection in selected places (which requires writing the vectorized paths and the dispatch by hand).
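For the second option, here's a minimal sketch of what "runtime detection in selected places" usually looks like in Rust (assuming x86_64; the function names are just for illustration):

```rust
// Only this one hot function gets an AVX2-enabled copy; the rest of the
// program is still compiled for the baseline target, but you have to write
// and dispatch the specialized path yourself.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[f32]) -> f32 {
    // With the feature enabled for this function, the compiler may use AVX2
    // here; hand-written intrinsics would also go in this body.
    data.iter().sum()
}

fn sum_scalar(data: &[f32]) -> f32 {
    data.iter().sum()
}

pub fn sum(data: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Safe: we just checked that the running CPU supports AVX2.
            return unsafe { sum_avx2(data) };
        }
    }
    sum_scalar(data)
}

fn main() {
    let data: Vec<f32> = (0..1024).map(|i| i as f32).collect();
    println!("{}", sum(&data));
}
```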
You have a fair point. I agree that while compiler performance is a priority, it is one of many priorities, and it's not currently super high on the list for many Rust Project developers. I wish it were different, but the only thing we can do is just do the work to make it faster :) Or support the people who work on it.
For many use-cases yes, but there are crates bottlenecked on different things than the codegen backend.
But I don't think that's the point. We could get rid of LLVM and use other backends, same as we could do other improvements. The point is that there are also other priorities and we don't have enough manpower to make progress faster.
I think a reasonable comparison would have to be a DoD Rust parser vs. the current Rust parser. Comparing across languages isn't very useful, because Zig has very different syntax rules and doesn't provide diagnostics anywhere near the same level as Rust does. The Rust compiler (and also its parser) spends an incredible amount of effort on diagnostics, to the point of actually trying to parse syntax from other languages (e.g. Python), just to warn people not to use Python syntax in Rust. Not to mention that it needs to deal with decl and proc macros, intertwine that with name resolution, etc. etc. All of this of course hurts parsing performance quite a lot, and IMO it would make it much harder to write the whole thing in DoD, while also shrinking the DoD performance benefits, because of all the heterogeneous work the Rust frontend does. Those are of course deliberate decisions of Rust that favor other things over compilation performance.
Your points here don't really make sense. There are many ways you can apply DoD to a codebase, but by far the main one (both easiest and most important) is to optimize the in-memory layout of long-lived objects. I won't claim to be familiar with the Rust compiler pipeline, but for most compilers, that means you'd have a nice compact representation for a `Token` and `AstNode` (or whatever you call those concepts), but the code between them -- i.e. the parser -- isn't really affected. In other words, all the fancy features you describe -- macros intertwined with name resolution, parsing syntax from other languages, high-quality diagnostics -- don't care about DoD! Our approach in the Zig compiler has evolved over time, but we're slowly converging towards a style where all of the access to the memory-efficient dense representation is abstracted behind functions. So, you write your actual processing (e.g. your parser with all the features you mention) just the same; the only real difference is that when your parser wants to, for instance, get a token (as input) or emit an AST node (as output), it calls functions to do that, and those functions pull out the bytes you need into a lovely `struct` or (in Rust terms) `enum` or whatever the case may be.
Our typical style in Zig, or at least what we tend to do when writing DoD structures nowadays, is to have the function[s] for "reading" that long-lived data (e.g. getting a single token out from a memory-efficient packed representation of "all the tokens") in the implementation of the DoD type, and the functions for "writing" it in the one place that generates that thing. For instance, the parser has functions to deal with writing a "completed" AST node to the efficient representation it's building, and the AST type itself has functions (used by the next phase of the compiler pipeline, in our case a phase called AstGen) to extract data about a single AST node from that efficient representation. That way, barely any code has to actually be aware of the optimized representation being used behind the scenes. As mentioned above, what you end up with is that the actual processing phases look more-or-less identical to how they would without DoD.
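As a rough illustration of that split, here's a made-up toy in Rust terms (not code from the Zig or Rust compilers): the lexer is the only "writer", the parser is a "reader", and both go through small functions so that neither ever touches the packed layout directly.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum TokenTag {
    Identifier,
    Number,
    Plus,
    Eof,
}

/// The convenient per-token view handed out to "processing" code.
#[derive(Clone, Copy, Debug)]
struct Token {
    tag: TokenTag,
    start: u32, // byte offset into the source
    len: u32,
}

/// The memory-efficient long-lived representation: three dense arrays
/// instead of one array of padded structs.
#[derive(Default)]
struct TokenList {
    tags: Vec<TokenTag>,
    starts: Vec<u32>,
    lens: Vec<u32>,
}

impl TokenList {
    /// "Write" side: lives next to the producer (the lexer).
    fn push(&mut self, tok: Token) {
        self.tags.push(tok.tag);
        self.starts.push(tok.start);
        self.lens.push(tok.len);
    }

    /// "Read" side: the parser calls this and gets a plain struct back, so
    /// the parser itself looks exactly like it would without DoD.
    fn get(&self, index: usize) -> Token {
        Token {
            tag: self.tags[index],
            start: self.starts[index],
            len: self.lens[index],
        }
    }
}

fn main() {
    let mut tokens = TokenList::default();
    // What a lexer would produce for "1 + 2".
    tokens.push(Token { tag: TokenTag::Number, start: 0, len: 1 });
    tokens.push(Token { tag: TokenTag::Plus, start: 2, len: 1 });
    tokens.push(Token { tag: TokenTag::Number, start: 4, len: 1 });
    tokens.push(Token { tag: TokenTag::Eof, start: 5, len: 0 });
    // Parser-style consumption: the packed layout is invisible here.
    assert_eq!(tokens.get(1).tag, TokenTag::Plus);
}
```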
FWIW, I don't think the parser is our best code here: it's one of the oldest "DoD-ified" things in the Zig codebase so has some outdated patterns and questionable naming. Personally, I'm partial to `ZonGen`[0] as a fairly good example of a "processing" phase (although I'm admittedly biased!). It inputs an AST and outputs a simple tree IR for a subset of Zig which is analogous to JSON. Then, for an example of code consuming that generated IR, take a look at `print_zoir`[1], which just dumps the tree to stdout (or whatever) for debugging purposes. The interesting logic is in `PrintZon.renderNode` in that file: note how it calls `node.get`, and then just has a nice convenient tagged union (`enum` in Rust terms) value to work with.
I also don't know all the details, but the Rust parser tokens contain horrible crimes, primarily because of macros. All I wanted to say was that applying DoD to the parser in Rust would (IMO) be much more difficult than in Zig, because of language differences and different approaches to error reporting. Not saying it's impossible ofc. That being said, I don't really think so much effort would be worth it here; the gain would be minimal in the grand scheme of things, as we have bigger perf. problems than parsing.
I have some random guesses as to why the memory-issue percentages differ (40% vs. 60-70%):
- 180k is not that much code. The 60-70% number comes from Google and Microsoft, and they are dealing with way larger codebases. Of course, the size of the codebase in theory shouldn't affect the percentage, but I suspect in practice it does, as the larger the codebase is, the harder it is to enforce invariants and watch for all possible edge cases.
- A related aspect is that curl is primarily maintained by one person (you), or at most a handful of core maintainers. Of course many more people contribute to it, but there is a single maintainer who knows the whole codebase perfectly and can see around all (or most) corners. For larger codebases with hundreds of people working on them, that is probably not the case.
- Curl is used by clients a lot (probably more by clients than servers, for whatever definition of those words) over which you have no control and monitoring. That means that some UB or vulnerabilities that were triggered "in the wild", on the client side, might not ever be found. For Google/Microsoft, if we're talking about Chrome, Windows, web services etc., which are much more controlled and monitored by their companies, I suspect that they are able to detect a larger fraction of vulnerabilities and issues than we are able to detect in curl.
- You write great code, love what you're doing and take pride in a job done well (again, if we scale this to a large codebase with hundreds of developers, it's quite hard to achieve the same level of quality and dedication there).
(sent this as a comment directly on the post, but it seems like it wasn't approved)
That didn't even occur to me, tbh :) But it doesn't have to be SQL linting, I just wanted to appreciate the mindset of not being lazy/afraid to write an unorthodox test.
If I only insert data into the DB once, I could miss important states. Like, I could add non-NULL data to a NOT NULL column, then make the column nullable, and then make it NOT NULL again. If I don't insert a NULL into the column between the last two migrations, I won't trigger the issue.
If you're writing applications or tests, it's mostly the simpler kind of code. If you're writing reusable (or performance-critical) code, you will start seeing generics and lifetimes much more often.
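To illustrate with a contrived example of my own (not from any particular crate): the same helper written "application style" versus as a reusable, allocation-free API, which is exactly where the generics and the explicit lifetime show up.

```rust
// Application/test style: concrete types, an owned String, no lifetimes.
fn longest_line_owned(text: &str) -> String {
    text.lines()
        .max_by_key(|line| line.len())
        .unwrap_or("")
        .to_string()
}

// Reusable-library style: generic over anything string-like, and the returned
// slice borrows from the input instead of allocating, hence the lifetime.
fn longest_line<'a, S: AsRef<str> + ?Sized>(text: &'a S) -> &'a str {
    text.as_ref()
        .lines()
        .max_by_key(|line| line.len())
        .unwrap_or("")
}

fn main() {
    let poem = String::from("short\na much longer line\nmid-sized line");
    assert_eq!(longest_line_owned(&poem), "a much longer line");
    assert_eq!(longest_line(&poem), "a much longer line");
}
```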