i mean… your work also went into the training set, so it's not entirely surprisi...

underdeserver · 2026-02-05T21:59:51 1770328791

Anthropic's version is in Rust though, so at least a little different.

ndesaulniers · 2026-02-05T22:22:35 1770330155

There are parts of LLVM's architecture that are long in the tooth (IMO), as is the language it's implemented in (also IMO).

I had hoped one day to re-implement parts of LLVM itself in Rust; in particular, I've been curious whether we can compile C concurrently (and parse C in parallel, or lazily). Those are approaches that haven't been explored in LLVM, and I think they might be safer to do in Rust. I don't know enough about grammars to know if it's technically impossible, but a healthy dose of ignorance can sometimes lead to breakthroughs.

LLVM is pretty well designed for testing. I was able to implement a lexer for C in Rust that could lex the Linux kernel, and use clang to cross-check my implementation (I would compare my interpretation of the token stream against clang's). Just having a standard module system makes reusable pieces seem like perhaps a better way to compose a toolchain, but maybe folks with more experience with rustc have the scars to disagree?

jcranmer · 2026-02-06T01:30:57 1770341457

> I had hoped one day to re-implement parts of LLVM itself in Rust

Heh, earlier today, I was just thinking about how crazy a proposal it would actually be to have a Rust dependency (specifically, the egg crate, since one of the things I'm banging my head against right now might be better solved with e-graphs).

ndesaulniers · 2026-02-06T22:50:51 1770418251

Guess I better add https://github.com/bytecodealliance/rfcs/blob/main/accepted/... to my reading list!

yoz-y · 2026-02-05T23:39:54 1770334794

One thing LLMs are really good at is translation. I haven’t tried porting projects from one language to another, but it wouldn’t surprise me if they were particularly good at that too.

andrekandre · 2026-02-06T11:33:31 1770377611

as someone who has done that in a professional setting, it really does work well, at least for straightforward things like data classes/initializers and average biz logic with if/else statements etc... things like code annotations and other more opaque stuff like that can get more unreliable though, because there are fewer 1-1 representations... it would be interesting to train an llm on each new pattern it encounters and slowly build up a reliable conversion workflow

rwmj · 2026-02-05T22:06:29 1770329189

It's not really important in latent space / conceptually.

D-Machine · 2026-02-06T02:01:12 1770343272

This is the proper deep critique / skepticism (or sophisticated goal-post moving, if you prefer) here. Yes, obviously this isn't just reproducing C compiler code from the training set, since this is Rust, but it is much less clear how much of the generated Rust code can (or cannot) be accurately seen as a translation of C code in the training set.

GaggiX · 2026-02-05T21:59:21 1770328761

Clang is not written in Rust tho

underdeserver · 2026-02-05T22:00:07 1770328807