What part of Rust compilation is the bottleneck?

choeger · on March 16, 2024

Monomorphization.

For every generic function f, rustc will generate as many instances as there are type instances (shape instances? Does the compiler distinguish between different kinds of references that all get compiled to pointers?).
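A minimal sketch of what "one instance per type" means at the source level. The function name `describe` is made up for illustration. The example only shows that each concrete `T` gets its own instantiation as far as the type system is concerned (note that `&u8` and `&u64` have the same size but are still distinct instances); it does not show whether the backend later merges identical machine code.

```rust
use std::any::type_name;
use std::mem::size_of;

/// Hypothetical generic function. Each concrete `T` it is called with
/// is a separate instantiation, so `type_name::<T>()` and
/// `size_of::<T>()` are resolved per instance at compile time.
pub fn describe<T>(_value: T) -> String {
    format!("{} ({} bytes)", type_name::<T>(), size_of::<T>())
}
```

Calling `describe(1u8)`, `describe(1u64)`, `describe(&1u8)` and `describe(&1u64)` asks for four distinct instances, even though the two reference types share the same pointer-sized layout.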

This feature has a cost. Compare it to OCaml's uniform object representation, which enables comparatively blazing compilation speed but pays a price in runtime performance and weird FFI restrictions (integers with a tag bit).

Btw, it's misleading to say "it's the backend" when the frontend is responsible for creating so much work for it.

flohofwoe · on March 16, 2024

> ...when the frontend is responsible for creating so much work for it.

That's the most important point I think. Clang compiling typical C code is very fast, but the same Clang compiling typical C++ code is very slow. Both use the same LLVM backend.

pjmlp · on March 16, 2024

Kind of. Despite its reputation for slow builds, it is possible to have relatively fast builds in C++ with monomorphization.

By using binary libraries, extern templates for common type sets, incremental compilation and linking, and nowadays (at least for VC++ already) modules.

What Rust still lacks is having sound alternatives to LLVM, or someone supporting similar workflows in Rust.

Using OCaml as an example, it is great to have multiple backends in the box, plus an interpreter, and pick and choose during development workflows.

sigsev_251 · on March 16, 2024

Maybe cranelift will help with this. Faster compile times is one of its selling points.

pjmlp · on March 16, 2024

That is something that I also look forward to.

zozbot234 · on March 16, 2024

Monomorphization can be manually addressed in Rust by writing generic function impls as delegating to a single function where the generic parameters are partly or fully omitted - a kind of "polymorphization". This is a pretty common pattern, e.g. in the Rust std library. In more recent versions of Rust this can be expanded via the use of const generics, e.g. to express the size and alignment of a generic type parameter, where the implementation only depends on these. So this kind of "polymorphization" can be applied more broadly.
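A minimal sketch of the delegation pattern described above, assuming a made-up function `component_count`. The generic outer function only converts its argument, and the real work lives in a non-generic inner function that is compiled once no matter how many argument types callers use.

```rust
use std::path::Path;

/// Hypothetical example: the generic shell is instantiated per `P`,
/// but it is tiny. All real work happens in `inner`, which takes a
/// plain `&Path` and therefore exists only once in the binary.
pub fn component_count<P: AsRef<Path>>(path: P) -> usize {
    fn inner(path: &Path) -> usize {
        // Non-generic body: not duplicated per caller type.
        path.components().count()
    }
    inner(path.as_ref())
}
```

Callers can pass `&str`, `String`, `PathBuf` and so on; each adds only a thin wrapper that forwards to the shared `inner`.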

VHRanger · on March 16, 2024

D does the same thing with its template system and the compilation is still extremely fast.

Alifatisk · on March 16, 2024

May I take the opportunity to ask, what’s the reason for Meta’s somewhat heavy use of OCaml? What’s the appeal? You already pointed out the insane compilation perf.

dermesser · on March 16, 2024

It's a pleasant and practical language to write, yet fairly safe.

Imagine the safety of Rust but looking more like Python (or Haskell...), the concurrency of Go (since V5), and without a borrow checker (but a GC instead).

davidmurdoch · on March 16, 2024

I watched some intro to OCaml videos, got excited about the language's features, then tried reading some real OCaml (Tezos, which was touted as the star of idiomatic OCaml projects - can't find the site that listed it now), and I found it so incredibly dense, hard to read, and almost completely devoid of meaningful naming and comments. It felt similar to reverse-engineering minified code to me.

whateveracct · on March 16, 2024

sounds like unfamiliarity. OCaml isn't a language a layman can read without prerequisites.

davidmurdoch · on March 16, 2024

I found the idea pleasant, but not in actuality. So maybe it's an acquired taste? Haha

whateveracct · on March 16, 2024

oh it definitely is. A lot of Haskell can look like you describe, but it's perfectly legible if you have enough reps under your belt. I find normal languages hard to read nowadays.

ad-ops · on March 16, 2024

With all the recent improvements to compilation speed (nightly, cranelift, mold-linker), Rust has become much more pleasant. Trivial and incremental changes to a medium sized crate like rust-analyzer (~200k loc) take around 2.5s and a small Axum project takes around 0.5s.

These are my very subjective hobby benchmarks running Arch Linux on an AMD Ryzen 9 7940HS.

Of course the initial build or the release build take much longer, but it makes me hopeful for the future.

andenacitelli · on March 16, 2024

Woah, 200k LoC is considered medium? I work at a Series A startup and our entire product (which is actually much more than a CRUD app) is only in the high tens of thousands, so that’s just a funny thought for me.

My theory is that because Rust is a low level language you tend to miss out on higher level primitives that promote more code reuse. Another theory is that Rust is mature but not quite as mature as something like Java, so there are fewer mature dependencies for you to delegate your work to.

Thoughts on what’s accurate? For context, I’ve written a bit of Rust myself, but am definitely a beginner.

pornel · on March 16, 2024

Rust is very good at code reuse.

Generics and cross-crate inlining enable zero-(runtime)cost abstractions, meaning there’s usually no perf downside to using 3rd party code instead of your own.

A strict type system, standardized error checking, thread safety in interfaces, and built in tooling for API documentation make using libraries relatively easy.

The ecosystem is pretty large now, and has a culture of respecting semver and a focus on safety and reliability.

Cargo makes adding dependencies easy (the most common complaint is that it’s too easy, and people use too many dependencies).

paulddraper · on March 17, 2024

You know Series A is early stage, right?

KingOfCoders · on March 16, 2024

One reason for me moving from Rust to Go was compilation speed. Go is a simpler language, so apples to oranges, but Go compiles so fast, which to me makes development very different.

pjmlp · on March 16, 2024

OCaml compiles just as fast without having to compromise on the type system.

kjksf · on March 16, 2024

But then you have to compromise on speed of generated code, poor support of Windows, number of libraries and overall ecosystem, no ability to generate standalone executables and probably more, but I tried OCaml only briefly so can't speak to all of its shortcomings.

pjmlp · on March 16, 2024

OCaml optimizations are certainly better than those of the Go compiler, which hardly does inlining and only recently got some PGO support, and those who care about using LLVM or GCC backends have to compromise on frontends that still don't do generics.

Go support on Windows is also not great, plugin package doesn't work, filesystem support assumes POSIX semantics, cgo requires installing mingw.

sigsev_251 · on March 16, 2024

> using LLVM backends

Wait, there is an LLVM based Go toolchain? I thought the Go crowd was known for their NIH obsession.

pjmlp · on March 16, 2024

TinyGo.

KingOfCoders · on March 16, 2024

Some see it as compromising on types, I don't. After some years writing Scala code, trying to come up with even better types each day, Go to me is not a compromise but a relief.

My love for types peaked when I was in my mid-40s, now that I'm 50+ I want simple things.

pjmlp · on March 16, 2024

I am almost 50, and my point of view on Go from 2012 has hardly changed.

http://lambda-the-ultimate.org/node/4554#comment-71504

At least it does generics now.

sigsev_251 · on March 16, 2024

> At least it does generics now

Not in gccgo

pjmlp · on March 17, 2024

Indeed, as you can see from my other remarks, I am aware of it; my point was about the language as designed, not the ecosystem of implementations.

TinyGo also doesn't do them.

margorczynski · on March 16, 2024

From my experience this simplicity is something you pay the price for along the way - development is harder (good typing gives you a lot of hints about functionality and puts bounds on how developers can use it) and more error prone (less stuff gets caught by the type checker, instead you find it out at runtime).

Of course it is good to be reasonable - some people completely fly off into the FP world and instead of actually building working stuff they think all day about some clever abstraction and types to model it.

naasking · on March 16, 2024

I'd call that disillusionment with bad type systems, which are indeed unnecessarily complex. We have yet to achieve a typing "nirvana", but we're getting closer IMO.

makapuf · on March 16, 2024

That is definitely true, but from the article it's the backend that takes time, not the frontend where the language itself resides. If you compile Go with LLVM, it may be as slow as Rust.

GrumpySloth · on March 16, 2024

It’s not just Rust. Practically every compiler based on LLVM is slow. Swift, Zig, Clang.

The Go compiler being written from scratch based on the Plan9 C compiler is a huge advantage.

txdv · on March 16, 2024

Zig is moving away from LLVM. It already has its own backend targeting debug builds for x86 and arm. ReleaseFast and ReleaseSmall are an entirely different beast, but it's going to be tackled eventually.

kjksf · on March 16, 2024

Minor correction: Go compiler used to be a modification of Plan9 C compiler, then a Go port of that modification but then it was completely rewritten as a SSA-based compiler so today it has almost nothing to do with the original Plan9 code.

rvdca · on March 16, 2024

I will chime in to say that Rust is also building an alternative to the LLVM backend in the form of cranelift: https://github.com/rust-lang/rustc_codegen_cranelift

KingOfCoders · on March 16, 2024

I wonder why we do not split up compilation more - especially for web developers. Rust does this a little with "check", C with "-O".

I want fast compilation for my dev cycle or for unit tests, I want slow compilation with optimizations, escape analysis, correctness etc. for production (the distinction between a compiler and linter is also not clear, some compilers do what linters do in other languages).

jokethrowaway · on March 16, 2024

We do, you have profile configuration for dev or for release

You can get pretty big differences in terms of compile speed / binary size

IshKebab · on March 16, 2024

Giving up entirely on the compile time of "performance" builds can be bad too though, e.g. for people writing games, audio software, etc.

bluGill · on March 16, 2024

Only if you need to test the full game. If you can unit test algorithms and learn something then fast builds matter.

anonyfox · on March 16, 2024

second that. also another point: I write most Go code using only the stdlib, so there is no dependency web to take care of on top of the actual code.

KingOfCoders · on March 16, 2024

I also have far fewer dependencies with Go compared to other languages, especially TS.

davidhyde · on March 16, 2024

If you make a small change to your application, the Rust compiler does a significant amount of rework. That is, it recompiles a lot of code that it has already compiled before. There are valid technical reasons for this because of how LLVM works or that the linker needs to rewrite all addresses. Yes, incremental compilation is a thing but it’s too coarse IMO. To me it seems that taking an extremely fine grained approach to compilation would improve the ergonomics of the iterative hack-and-run method of writing software. Some sort of local database of diffs or some such.

makapuf · on March 16, 2024

Depends on your target; if you have tiny compilation units you won't be able to optimize / inline on a broad target, that's why single unit compilation is an option (that may or may not improve the result)

davidhyde · on March 16, 2024

This is true but what is really happening is that the frontend and backend cannot communicate intent effectively because of the way they are separated. The frontend doesn’t know what is important for optimisation because that’s not its job and the backend only sees the code the frontend gives it (never a holistic view). So the easiest (and slowest) approach is to do everything over and over again.

Increasing Codegen units (multi unit compilation) is just the user taking a risk that splitting things up will not affect performance optimisations. Nothing smart about it.

If you had tiny compilation units and the frontend understood their significance to the backend then it would be able to build a graph of dirty code to be recompiled when a small piece of it changes.
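To make the "graph of dirty code" idea concrete, here is a rough sketch under stated assumptions: the unit names and the `dirty_set` function are hypothetical, and this is not how rustc's incremental compilation is implemented. It only shows the basic propagation step, where a changed unit marks everything that transitively depends on it for recompilation.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Hypothetical helper: `dependents` maps each unit to the units that
/// use it. Returns the changed unit plus everything that transitively
/// depends on it, i.e. the set that would need recompiling.
pub fn dirty_set(
    dependents: &HashMap<&str, Vec<&str>>,
    changed: &str,
) -> BTreeSet<String> {
    let mut dirty = BTreeSet::new();
    let mut queue = VecDeque::new();
    dirty.insert(changed.to_string());
    queue.push_back(changed.to_string());

    // Breadth-first walk over reverse dependencies.
    while let Some(unit) = queue.pop_front() {
        if let Some(users) = dependents.get(unit.as_str()) {
            for user in users {
                // `insert` returns false for units already marked dirty,
                // so each unit is visited at most once.
                if dirty.insert(user.to_string()) {
                    queue.push_back(user.to_string());
                }
            }
        }
    }
    dirty
}
```

The finer the units, the smaller this set tends to be for a given edit, which is the trade-off against cross-unit optimisation discussed above.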

yxhuvud · on March 16, 2024

One problem they really need to address though is that as soon as you put everything in a single compilation unit, then compilation is single threaded and dog slow. Compilation units maybe make sense for C and C++, but for other languages they are just a way to structure the compilation. It should be automatic, and just better all around.

hun3 · on March 17, 2024

As a workaround, split into multiple crates.

Disclaimer: I concur that this is merely a workaround, not the real fix

jokethrowaway · on March 16, 2024

Over the years compile speed improved quite a bit (recently this: https://blog.rust-lang.org/2023/11/09/parallel-rustc.html made quite the difference)

If we can squeeze more performance that's great but the largest concern I have around compilation is with the size of the target directory

It can balloon up to node_modules levels

chrismorgan · on March 16, 2024

I don’t find node_modules exceeding even a few hundred megabytes very often. But Rust target directories can easily reach multiple gigabytes.

pornel · on March 16, 2024

Apart from incremental compilation cache, a large chunk of it is debug information. Lowering debug info precision helps a lot (although it’s still suspiciously large.)

kobzol · on March 16, 2024

There have been some recent improvements to this, but yeah, it can be still quite large. There is a WIP development of a garbage collector in Cargo that could help with this.

yu3zhou4 · on March 16, 2024

FWIW Jakub has more blog posts on his website which are also really interesting https://kobzol.github.io/

nbittich · on March 17, 2024

Build time should be instantaneous, just like in Go or C. Instead, a simple hello world in bevy / egui can take forever to build. Even after the first build, build time is noticeable for every change. You already have to struggle with the borrow checker; adding the build time to that makes Rust the worst dev experience for anything that requires adding dependencies to your project. I've been using Rust for 3 years as my main programming language for hobby projects, but now I've decided to switch to C and Go until there are relevant improvements in Rust.

ivanjermakov · on March 17, 2024

I would love to see more categorized frontend breakdown. Just type check, borrow check and metadata don't show the full picture.

AndrewDucker · on March 16, 2024

Could LLVM be sped up by passing it the data in a more efficient structure? Or by slimming down the data it's passed in some way?

Sharlin · on March 16, 2024

It’s well known that the LLVM IR bytecode generated by rustc is terribly verbose – Rust constructs mostly get lowered very naively to a buttload of redundant IR, to be then pruned and condensed by the backend. This is by design, as it helps keep the frontend as simple and fast as possible, but there are certainly cases where it would be a net benefit to move some of the complexity to rustc in order to lighten LLVM’s workload.

binary132 · on March 16, 2024

I'll never forget my first experience with Rust: trying to build the compiler from source and OOMing my cloud VM. :)

hobs · on March 16, 2024

Good article, just throwing out there that flamegraphs would be exactly what you need for visualizing this stuff.

foota · on March 16, 2024

It looks like the profile they're built on already supports those, I think the intention here is to present a sort of at a glance view that could be quickly analyzed.

hobs · on March 16, 2024

Makes sense. In my opinion I would just generate fake flamegraphs then - they are much more readable and the format is dummy easy to fake.

kobzol · on March 16, 2024

Yeah, I actually generated these small charts out of a flamegraph, because it contains too much information and isn't easily split into three distinct parts. And once you condense the information into just 3 blocks, then using a flamegraph doesn't really add any further value, IMO.

hobs · on March 17, 2024

/shrug that's fair, with only like four things it's not that much more readable.

xmcqdpt2 · on March 16, 2024

For libraries, the article shows most of the time being spent in "front end" phases. Isn't that a bit misleading, as the library will eventually have to be included in the final program's binary? The code generation phase isn't exactly attributable to any one module.