It's finally possible to make Linux binaries in a high-level language without involving anything written in C at build time or run time!
(And I think with Cranelift + -Z build-std, but without any libc, one should be able to skip linking separately-compiled object files too, and thus skip LLD and C++.)
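For concreteness, here's a hedged sketch of what such a build-std invocation might look like; the nightly flags and the Cranelift backend selection shown are assumptions based on the unstable interfaces, and may differ on your toolchain.

```shell
# Sketch only: rebuild core/alloc from source instead of linking the
# prebuilt standard library artifacts. Requires a nightly toolchain.
rustup toolchain install nightly
rustup component add rust-src --toolchain nightly
cargo +nightly build -Z build-std=core,alloc --target x86_64-unknown-linux-gnu

# Selecting Cranelift as the codegen backend would look roughly like
# this (unstable flag; assumed to match the rustc_codegen_cranelift setup):
RUSTFLAGS="-Zcodegen-backend=cranelift" cargo +nightly build -Z build-std=core,alloc
```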
Yeah, I was referring to the reference toolchain. I'm not very deep in the Go ecosystem, but I presume it's the one with the most use?
Regarding the bootstrapping point, rustc was bootstrapped via a "rustboot" compiler written in OCaml. That doesn't make the Rust compiler any less Rust now, as it compiles itself nowadays (to be precise, version N compiles version N+1, and also supports compiling version N, at least for bootstrapping purposes).
Indeed, those two aren't the same concept, but the person I replied to said "It's finally possible to make Linux binaries in a high-level language without involving anything written in C at build time or run time", so they were mainly referring to the possibility rather than to every implementation being that way. For Go I didn't want to go into further detail, as I assumed the "default Go experience" entails only the official Go toolchain.
Parts of Go's runtime are written in C, so the end-to-end being single-language is emphatically not like Go.
relibc is a genuine libc implementation. However, I do believe Rust on Redox might indeed not factor through relibc's C interface for everything, taking advantage of Rust being on both sides. That is like Go.
All you have to do is build the toolchain with CGO_ENABLED=0. The optional cgo bits aren't needed for anything the compiler cares about, I don't think; IIRC, with cgo enabled it will use the system's DNS resolver on Linux, for example, instead of reading resolv.conf and talking to DNS servers directly.
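A minimal sketch of what that looks like, assuming a checkout of the Go source tree (the paths here are illustrative):

```shell
# Build the Go toolchain itself with cgo disabled; make.bash is the
# standard bootstrap script in the Go source tree.
cd go/src
CGO_ENABLED=0 ./make.bash

# The same env var applies to ordinary builds: it forces e.g. the
# pure-Go DNS resolver instead of the system's C one.
CGO_ENABLED=0 go build ./...
```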
You're going to need an operating system; Redox might work, though I'm not sure it can run rustc yet.
Once you have the operating system you need the hardware, and specifically you need the hardware itself not to be relying on C. I'm not sure how I would even go about testing that, but even if I had a magic wand I don't think it would be easy to cobble together such a computer... and I can't even use particularly exotic hardware because of the limited OS selection.
Yeah, cargo also has some C dependencies. It uses curl to download things from the network, openssl for TLS connections (at least on Linux), and libgit2 to clone git repositories; the git support is needed both for crates.io downloads (the registry is a git repo) and for git dependencies in Cargo.toml.
But I think the progress is still very important. It shows that Rust can do basically anything. Also, Rust rewrites exist for the C dependencies Cargo uses: reqwest, rustls, and various (currently highly immature) pure-Rust git libraries like gitoxide and https://github.com/chrisdickinson/git-rs.
GitHub doesn't understand that .v3 is a new programming language. Their policy is not to recognize new languages unless "hundreds" of repos are using them. Thus, it concludes my project is mostly bash. :-)
The C code in the repo is used for a few benchmarking and testing harnesses. It is not used in the runtime or compiler or libraries.
To be clear, Cranelift's original purpose was JIT-compiling WASM to machine code, not compiling other things to WASM. Skipping linking isn't too radical: traditional compilers already effectively do it intra-procedurally when resolving labels to (relative) addresses.
"wasm backend" sounds like you were thinking the latter?
The Cranelift codegen backend for the Rust compiler has been merged into the main Rust git repository! It is not yet distributed with rustup, but there are instructions on how to use it here [1].
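As a hedged sketch (not the linked instructions themselves): on later nightlies the backend became available as a rustup component, and usage looks roughly like the following; the component and flag names are assumptions if your toolchain differs.

```shell
# Install the preview component and select the backend via an
# unstable rustc flag (nightly-only).
rustup component add rustc-codegen-cranelift-preview --toolchain nightly
RUSTFLAGS="-Zcodegen-backend=cranelift" cargo +nightly build
```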
It's nice to see an LLVM alternative. I get the appeal of putting all of the optimizations in one place so that everyone can use them, but the concerns over an LLVM monoculture are also valid. Now all that's missing is something to transform IR between LLVM and Cranelift, or vice-versa, to make them interchangeable.
Be aware that Cranelift is not an LLVM replacement (and isn't meant to be one).
It's a fast but not-much-optimizing codegen backend that complements LLVM for the use case where you want to emit code quickly but don't care much about optimizing it beyond "simple"/"basic" optimizations.
Besides that, why do you think there should be LLVM IR to Cranelift conversions? I just can't find any use case for which that makes sense, as it's always simpler and more efficient to generate Cranelift IR from MIR directly, instead of generating LLVM IR and then trying to convert it to Cranelift.
Is speed a specific goal of Cranelift? That is, can we expect that it will aim for fast compiles forever, and not slowly accrue features and optimizations? From what I can tell, Cranelift was originally designed for Wasm, and that's still its primary purpose. That means that its goals might not be permanently aligned with those of Rust developers who desire a fast, non-optimizing backend.
Like simcop2387 said, a codegen backend which is focused on wasm will likely never be "made slow".
Wasm is, in the end, close to a form of high-level assembly, which means it is (or should be) already pre-optimized as much as needed/wanted/possible.
Cranelift still needs to do some optimizations, especially wrt platform-specific aspects like register allocation and the like.
In the end you don't want long load times when using wasm (at least in the web browser, and maybe in other places too).
Though maybe in the future there will be options to opt in to some more time-intensive optimizations, e.g. for in-browser gaming. But then Rust could simply not opt in.
It might get further optimizations, but the fact that it's intended for a WASM JIT does mean that it's going to have to keep optimizing for compile speed, and avoid regressions thereof. It might not ever target compile speed completely and exclusively, but you probably wouldn't want it to either; instead you'd want a reasonably low-latency compiler backend that does some optimizations, so that the resulting code can actually run at a reasonable speed.
LLVM monoculture? People used to complain about the gcc monoculture, and LLVM hasn’t quite caught up with (the moving target of) gcc. Plus gcc has more front ends.
I use them both and appreciate their different strengths. There’s no question that the existence of llvm spurred gcc to improve a lot, and vice versa.
I meant LLVM monoculture, not Clang monoculture. All the new languages use LLVM for a backend. (There are, of course, exceptions, but the point stands.)
gcc has more front ends? really? https://gcc.gnu.org/frontends.html lists seven bundled frontends and eight third-party frontends, some of which "are very much works in progress", and none of which I have ever heard of anyone using. https://en.wikipedia.org/wiki/LLVM lists about twenty-five, most of which I have heard of.
I believe almost none of the frontends listed in that Wikipedia section are included in the main llvm-project tree. To the best of my knowledge, the only stable in-tree frontend is Clang, with Flang (for Fortran) in-tree but considered WIP.
Clang, in turn, only supports C, C++, Obj-C, and Obj-C++, so that's 4 frontend languages.
Edit: But yes, as 'Gaelan notes, I think they meant compiler backends.
that's a different project organization style though. it's like saying that bsd has so much more functionality than linux just because more stuff is bundled. maybe the bsd way makes better software, maybe not, but it would be insane to say that openbsd has "more web server functionality" than debian because openbsd comes with httpd but on debian you have to install one.
Since cranelift is pure Rust, that should simplify PGO (mentioned last week in https://news.ycombinator.com/item?id=25060762), which gave 16% for LLVM; hopefully the same holds for cranelift. So with that, plus not using the default linker, you could get significantly faster debug builds.
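For reference, the generic rustc PGO workflow (the 16% figure above was for PGO-building the compiler itself) looks roughly like this; the binary name and workload below are hypothetical placeholders:

```shell
# 1. Build with instrumentation.
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" cargo build --release
# 2. Run a representative workload (hypothetical binary/arguments).
./target/release/mybin --typical-workload
# 3. Merge the raw profiles (llvm-profdata ships with the llvm-tools
#    rustup component).
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data
# 4. Rebuild using the merged profile.
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" cargo build --release
```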
Two more orders of magnitude and it could really be something!
But seriously: I wonder if it would help to generate the code needed to compile code that depends on a specific collection of crates, and then compile and run that. Something drastic needs to be done. Apologetics are not carrying the day.
I've been having a search fail trying to look up information on typical heap space consumed by the compiler.
At one point I recall people complaining that you couldn't build 32 bit Rust on the target machine because the compiler couldn't squeeze itself into a couple gigs of memory. I'm curious if we are still in cross-compiler territory there or if the last few generations of improvements have done much for that limitation.
(llvm monoculture comments aside) I find it really interesting that llvm isn’t able to be used in a similar way and achieve similar speed gains for incremental debug builds. Especially since swift debug build times probably matter enough for Apple to want to have a similar fast path with an llvm tool chain
Which is why they came up with SIL (Swift Intermediate Language) and make use of their own bitcode extensions, and why not everything Apple does lands upstream, thanks to the LLVM license.
To be fair, gcc requires contributors to assign copyright to the FSF or dedicate their work to the public domain. Neither is required by the GPL's copyleft, so if Apple used gcc, they could still make changes that can't be upstreamed due to gcc's policy.
QEMU talked about replacing some code with rust. I wonder if they would start with TCG or device code. Could be interesting to try an existing Rust codegen. What backends are implemented for cranelift?
I recall cranelift was theorizing that it'd be able to compete with llvm's debug performance. Are there benchmark comparisons for the actual generated binaries' perf?
I think this is one of those self-fulfilling kind of things. Rust is known to have glacially slow compile time perf with very fast runtime perf. Thus it would attract people who consider runtime performance more important than compiler performance and put off people who think the opposite.
It would be interesting to see both within the same project. Dependencies would be compiled with LLVM and the application would be compiled with cranelift. Through an intelligent project structure you would be able to get both fast compile times and a fast debug build.
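A sketch of what that split could look like using cargo's per-package profile overrides together with the (unstable, nightly-only) codegen-backend profile option; the option names are assumptions if your toolchain differs:

```shell
# Append hypothetical profile overrides: Cranelift for the workspace's
# own code, LLVM for all dependencies.
cat >> Cargo.toml <<'EOF'
[profile.dev]
codegen-backend = "cranelift"

[profile.dev.package."*"]   # "*" matches dependencies, not workspace members
codegen-backend = "llvm"
EOF
cargo +nightly build -Zcodegen-backend
```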
Eiffel is an example of how it can work both ways.
While you are developing you get a JIT-based environment (the MELT VM) with the productivity workflow one knows from languages like Java and .NET.
Then, when it comes to a proper release, you use the AOT compiler, which uses a C or C++ compiler as part of the build workflow (the Eiffel AOT compiler generates C or C++ code as its intermediate format).
So you get the quick edit-compile-debug workflow, and when feeling like doing a full release build just let the C or C++ compiler do its job.
There are other examples; I just chose Eiffel as one of them.
There was a language that started working on an entirely separate compiler toolchain for debug builds (prioritising compilation+linking speed); I think it was Zig, but I can't remember exactly.
So this means compilation will be even faster on the new M1 Macs!
Rust already compiles pretty fast on the M1 with the LLVM backend. This is going to speed that up even further.
I think it would save time to simply cross-compile for other OS/arch targets from the M1.
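A minimal sketch of such a cross-compile, assuming a musl cross-linker is installed separately (the target and linker names are illustrative):

```shell
rustup target add x86_64-unknown-linux-musl
# Tell cargo which linker to use for the foreign target, e.g. in
# .cargo/config.toml:
#   [target.x86_64-unknown-linux-musl]
#   linker = "x86_64-linux-musl-gcc"
cargo build --release --target x86_64-unknown-linux-musl
```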
I know little about Cranelift; it seems to be mostly functionally equivalent to LLVM, but with a smaller footprint because of intentional design tradeoffs.
Could someone more enlightened share an overview of Cranelift's capability in handling deep learning relevant compiling tasks: gemm optimization, data flow, scheduling, GPU backend, etc?
Also, it seems Cranelift only has one IR format as the interface to the language frontend. Does it have things like MLIR to bridge diverse optimization needs of multiple different languages?
Cranelift is designed to do very little optimization, in exchange for being very simple and fast compared to LLVM. Its use cases include debug builds of Rust code, JITing WebAssembly (which is pre-optimized), etc.
So I would expect it to have very little overlap with deep learning. No high-level optimizations to do with matrices or dataflow or scheduling; currently no GPU backends; no integration with MLIR (though that one seems completely doable if someone put in the effort).
The benchmarks were done before split DWARF support was enabled for the LLVM backend.
Once enabled, it should allow LLVM to be as fast as, or faster than, Cranelift while keeping sane runtime performance.
https://github.com/rust-lang/rust/pull/77117
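For context, enabling split debuginfo on a debug build looks roughly like this (the flag shown is the form it later took; at the time of the linked PR it was still unstable):

```shell
RUSTFLAGS="-Csplit-debuginfo=unpacked" cargo build
# or, equivalently, in Cargo.toml:
#   [profile.dev]
#   split-debuginfo = "unpacked"
```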
I'm no expert, but that sounds like a speedup in the linker rather than in the code generator. I believe that the Cranelift numbers referenced here are using the system linker; using LLD should show even greater performance improvements (though it's still worth benchmarking). Furthermore it sounds like something that could be added to Cranelift as well.
I'd be surprised if it was as fast. Cursory googling indicates roughly a 15% time reduction, so cranelift is set to save twice as much time as split DWARF.
If you have other stats, I'd love to see them. I imagine that c++ numbers don't translate over into rust 1:1.
- this
- https://docs.rs/gimli/0.23.0/gimli/
- https://github.com/rust-lang/compiler-builtins
- https://github.com/redox-os/relibc
- https://lld.llvm.org/