Maybe I'm naive, but I find these re-engineering complex product posts underwhel...

bs7280 · 2026-02-05T21:42:46 1770327766

I don't see this as just an exercise in making a new useful thing, but as benchmarking a SOTA model's ability to create a massive* project on its own, with some verifiable metrics of success. I believe they were able to build FFmpeg with this compiler written in Rust?

How much would it cost to pay someone to make a C compiler in Rust? A lot more than $20k.

* massive meaning "total context needed" >> model context window

stephc_int13 · 2026-02-05T21:41:59 1770327719

This is a nice benchmark IMO. I would be curious to see how competitors and improved models would compare.

NitpickLawyer · 2026-02-05T21:55:27 1770328527

And how long will it take before an open model recreates this? The "vibe" consensus before "thinking" models really took off was that open was ~6mo behind SotA. With the massive RL improvements over the past 6 months, I've thought the gap was actually increasing. This will be a nice little verifiable test going forward.