Unifying identifiers in the lexer doesn't solve the problem. The problem is getting the parser to produce a sane AST without needing information from deeper in the pipeline. If all you have is `foo * bar;`, what AST node do you produce for the operator? Something generic like "Asterisk", whose child nodes are some generic "Identifier" node (even though at this stage, unlike in the lexer, you should be distinguishing between types and variables), and then you fix it up in some later pass. It's a flaw in the grammar, period. And it's excusable, because C is older than Methuselah and was hacked together in a weekend like JavaScript and was never intended to be the basis for the entire modern computing industry. But it's a flaw that modern languages should learn from and avoid.
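To make the ambiguity concrete, here's a minimal sketch in C. The function names `as_declaration` and `as_expression` are made up for illustration; the only point is that the exact same tokens `foo * bar;` appear in both bodies and mean different things depending on what `foo` was declared as earlier.

```c
/* Case 1: `foo` names a type, so `foo * bar;` is a DECLARATION
 * of `bar` as a pointer to int. */
int as_declaration(void) {
    typedef int foo;
    int x = 5;
    foo * bar;      /* declares bar with type int* */
    bar = &x;
    return *bar;    /* dereferencing only works because bar is a pointer */
}

/* Case 2: `foo` names a variable, so `foo * bar;` is an EXPRESSION
 * statement: a multiplication whose result is thrown away. */
int as_expression(void) {
    int foo = 6, bar = 7;
    foo * bar;      /* multiplies and discards; compilers typically warn here */
    return foo * bar;
}
```

A parser can't tell these two apart from the tokens alone; it has to know whether `foo` is currently a typedef name, which is exactly the "information from deeper in the pipeline" problem.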

C ain't simple, it's an organically complex language that just happens to be small enough that you can fit a compiler into the RAM of a PDP-11.