Fair enough. Would the problem be fixed if the C standard said that *x (without spaces) had to be dereferencing and that x * y (with spaces) had to be multiplication?
Yeah, if you could purely lexically distinguish dereferencing from multiplication, either because they used a different symbol, or had different mandatory whitespace rules, then there'd be no ambiguity, at least in this example.
That's basically what C++ did by requiring the space in:
Foo<Bar<Baz> > var;
to make it easy to lexically distinguish the '>>' right-shift operator from the '> >' sequence of two successive template-parameter closing symbols.
C++0x is changing that though, due to the unpopularity of making programmers accommodate what looks like a parser-implementation hack.
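To make the '> >' point concrete, here is a minimal sketch using std::vector in place of the Foo/Bar/Baz placeholders above. The names Grid, make_grid and shift_right are made up for illustration. The typedef is written with the space so it compiles under C++03 as well; a C++11 compiler also accepts it without the space.

```cpp
#include <vector>

// Nested template argument lists, written with the space that C++03 requires.
// Without the space, a C++03 lexer reads ">>" as one right-shift token.
typedef std::vector<std::vector<int> > Grid;

// The same two characters, this time genuinely meaning "shift right".
inline int shift_right(int value, int bits) {
    return value >> bits;
}

// Build a rows x cols grid of zeros (hypothetical helper for the example).
inline Grid make_grid(int rows, int cols) {
    return Grid(rows, std::vector<int>(cols, 0));
}
```

A caller could write Grid g = make_grid(2, 3); and then index g[0][0] as usual.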
That's not a syntactic ambiguity. Typically (when you're not trying to compress multiple passes, be clever and fast etc.), identifiers are not resolved during parsing, so it doesn't matter whether x denotes a type or a variable when the sizeof operator is applied.
You can write "A * B", and depending on the nature of A, that is either a declaration of B as a pointer to type A or a multiplication of A and B. You could say that this multiplication would be 'void' (as in 'not assigned to anything'), and I could come up with an even more complicated example, like "A * B();", which is either a function declaration or a multiplication of A by the result of calling a function named B, which could have a side effect. But that's not the point: if the parser has to apply heuristics like that to parse the language properly, the language is already context-dependent, or at least not LALR(0) or LALR(1).
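Here is a small sketch of the "A * B" case, with the same token sequence appearing in two scopes. The namespace and function names (as_type, as_value, demo) are invented for the example; only what A was declared as differs between the two.

```cpp
namespace as_type {
    typedef int A;  // here A names a type

    inline bool demo() {
        A * B;          // declaration: B is a pointer to A (that is, int*)
        B = 0;          // legal only because B is a pointer variable
        return B == 0;
    }
}

namespace as_value {
    inline int demo() {
        int A = 6;      // here A names a variable
        int B = 7;
        A * B;          // expression statement: the product is computed and
                        // discarded (a compiler may warn that it is unused)
        return A * B;
    }
}
```

The statement "A * B;" is character-for-character identical in both functions, so a parser can only choose between the two readings if it knows how A was declared.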
I think it does matter, although my example fails to make clear why. Type identifiers have different syntactic requirements to regular identifiers (the parens in sizeof(typename) are required, but the parens in sizeof(varname) are optional).
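A quick illustration of the parenthesis rule, as a sketch with made-up function names:

```cpp
#include <cstddef>

// Operand is a type name: the parentheses are mandatory.
inline std::size_t size_of_type() {
    return sizeof(int);
}

// Operand is a variable: the parentheses are optional.
inline std::size_t size_of_variable() {
    int x = 0;
    return sizeof x;
}

// Operand is a variable in (redundant) parentheses: also fine.
inline std::size_t size_of_parenthesized_variable() {
    int x = 0;
    return sizeof(x);
}

// By contrast, "sizeof int" without parentheses is rejected by the compiler.
```

All three functions return the same value, since x is an int in each case.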
Yes, but now you're getting deeper into requirements for context-sensitive grammar; go far enough, and you might as well go all Van Wijngaarden. A context-free parse is free to create an AST like (sizeof (parens (ident "x"))) for one, and (sizeof (ident "x")) for the other, and disambiguate based on the symbol table lookup of "x" later.
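A rough sketch of that two-step approach, under my own assumptions: SizeofNode, Resolution and resolve are hypothetical names, and the symbol table is reduced to a set of known type names. The parse step only records the identifier and whether it was parenthesized; the lookup happens afterwards.

```cpp
#include <set>
#include <string>

// Hypothetical AST node for a sizeof operand, built without any symbol lookup.
struct SizeofNode {
    std::string ident;    // the identifier that was written
    bool parenthesized;   // true for sizeof(x), false for sizeof x
};

enum Resolution { SizeOfType, SizeOfExpression, ErrorTypeNeedsParens };

// Later pass: consult the symbol table (here just a set of type names)
// and decide what the already-parsed node means.
inline Resolution resolve(const SizeofNode& node,
                          const std::set<std::string>& type_names) {
    const bool is_type = type_names.count(node.ident) != 0;
    if (!is_type) {
        return SizeOfExpression;  // variables work with or without parens
    }
    return node.parenthesized ? SizeOfType : ErrorTypeNeedsParens;
}
```

With this split, the unparenthesized-type case becomes an error reported after parsing rather than something the grammar itself has to reject.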