
I'd argue the causality goes the other way. Microcode allows you to do complex things on top of a very minimal execution engine. You might use the same ALU for an add as you would for a multistage multiply as you would for a computed jump, etc...
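
To make that concrete, here's a toy model in C (everything invented for illustration, modeled on no real part): one lone adder serves both a single-shot ADD and a multiply that's just a microcoded shift-add loop over the same ALU.

    #include <stdint.h>
    #include <stdio.h>

    /* The one shared ALU; every micro-op funnels through it. */
    static uint8_t alu_add(uint8_t a, uint8_t b) { return a + b; }

    /* A "multiply instruction" is microcode: eight micro-cycles,
       each reusing the same adder a plain ADD would use. */
    static uint8_t ucode_mul(uint8_t a, uint8_t b) {
        uint8_t acc = 0;
        for (int step = 0; step < 8; step++) {
            if (b & 1)
                acc = alu_add(acc, a);   /* same ALU as ADD */
            a <<= 1;
            b >>= 1;
        }
        return acc;
    }

    int main(void) {
        printf("%u\n", ucode_mul(7, 6));   /* prints 42 */
        return 0;
    }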

But as CPUs got bigger and suddenly had enough hardware to do "everything" at once, they had the problem of circuit depth. Sure, you can execute the whole instruction in one cycle but it's a really LONG cycle.

You fix that problem with pipelining ([R]eally [I]nvented by [S]eymour [C]ray, of course).
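
Back-of-envelope numbers for why (all invented for illustration, not from any real chip):

    #include <stdio.h>

    int main(void) {
        double depth_ns = 10.0;  /* total combinational logic depth  */
        double stages   = 5.0;   /* pipeline stages to slice it into */
        double latch_ns = 0.2;   /* per-stage latch overhead         */

        printf("single cycle: %.0f MHz\n", 1000.0 / depth_ns);
        printf("pipelined:    %.0f MHz\n",
               1000.0 / (depth_ns / stages + latch_ns));   /* ~455 */
        return 0;
    }

Same logic depth, roughly 4.5x the clock once the pipe fills.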

But you can't pipeline a complicated microcoded instruction set. Everything that happens has to fit in the same pipeline stages. So, the instruction set naturally becomes "reduced".
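
A sketch of the constraint, using the classic five-stage template (the memory-to-memory add is a made-up strawman):

    /* Every instruction must march through the same five slots. */
    enum stage { IF, ID, EX, MEM, WB };

    int main(void) {
        /* lw  r1,0(r2)   IF ID EX MEM WB   fits
           add r3,r1,r4   IF ID EX MEM WB   fits (MEM idles)
           add [r1],[r2]  needs MEM, EX, then MEM again --
           no slot for the second access, so a load/store
           split gets forced onto the ISA.                  */
        return 0;
    }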

Basically: RISC is the natural choice once VLSI gets rolling. It's not about simplification at all; it's about exploiting all the transistors on much more "complicated" chips.



Except older CPUs already sequenced heavily within an instruction, even more than you might think. The Z80, for instance, had only a 4-bit ALU and would pump it multiple times to get the required bit width. The early 808x averaged 5 or so cycles per instruction. Internally, though, their microcode typically issued once a cycle.
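
The Z80 trick as a toy model in C (nibble-serial add; the real microarchitecture differs in detail):

    #include <stdint.h>
    #include <stdio.h>

    /* 8-bit add built from two passes through a 4-bit adder,
       with the carry rippling between the halves. */
    static uint8_t add8_via_4bit(uint8_t a, uint8_t b) {
        unsigned lo    = (a & 0x0F) + (b & 0x0F);      /* pass 1: low nibble  */
        unsigned carry = lo >> 4;
        unsigned hi    = (a >> 4) + (b >> 4) + carry;  /* pass 2: high nibble */
        return (uint8_t)(((hi & 0x0F) << 4) | (lo & 0x0F));
    }

    int main(void) {
        printf("0x%02X\n", add8_via_4bit(0x7F, 0x01));  /* prints 0x80 */
        return 0;
    }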

> But you can't pipeline a complicated microcoded instruction set. Everything that happens has to fit in the same pipeline stages. So, the instruction set naturally becomes "reduced".

That's what they said in the early 80s, but then the 486 came out. AFAIK the longest-pipelined general-purpose systems were also fairly heavily microcoded (NetBurst).


The 80486 was only minimally pipelined, and in fact if you squint it fits what would later become the standard model: an "expansion" engine at the decode level emits code for the later stages (which look more like a RISC pipeline, with separated cache/execute/commit stages). That engine is still microcoded (because VLSI might have been rolling, but no way can you do a uOp cache in ~1.2M transistors), and still limited to multicycle execution for all but the simplest instructions.
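
That expansion step, sketched (uOp names invented here; real decoders differ): one x86-style read-modify-write instruction cracked into three RISC-like steps, each fitting one pass through the back-end pipe.

    #include <stdio.h>

    typedef struct { const char *op, *effect; } uop;

    int main(void) {
        /* "add [mem], reg" after the decoder's expansion: */
        uop cracked[] = {
            { "LOAD",  "tmp   <- [mem]"     },  /* cache stage   */
            { "ADD",   "tmp   <- tmp + reg" },  /* execute stage */
            { "STORE", "[mem] <- tmp"       },  /* commit stage  */
        };
        for (int i = 0; i < 3; i++)
            printf("%-5s %s\n", cracked[i].op, cracked[i].effect);
        return 0;
    }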

Basically, if you were handed the transistor budget of an 80486 and told to design an ISA, you'd never have picked a microcoded architecture with a bunch of addressing modes. People who did RISC in those chip sizes (MIPS R4000, say) were beating Intel by 2-3x on routine benchmarks.

Again: it was the budget that informed the choice. Chips were bigger, and people had to figure out how to make ~1.2M transistors work in tandem. And obviously, when chips got MUCH bigger, it stopped being a problem: dynamic ISA conversion becomes the obvious choice when you have 200M transistors.



