Hah, linking to OWASP is a good point. I have now put some more thought into "how would I EXACTLY classify this, if I had to list it as a finding for a customer".
Now, I think "format string" is not entirely right, either. That's more related to `printf`, which separates code (the format string) from data (the varargs). log4j, however, mixes code and data into one string. In the OWASP framework, I think the more generic variant of format string attacks would be https://owasp.org/www-community/attacks/Code_Injection (that's actually linked under "related" for the format string attacks).
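To make the code/data distinction concrete, here is a minimal Python sketch. It is a toy model under my own assumptions, not log4j itself: `LOOKUPS`, `expand_lookups`, `printf_style` and `mixed_style` are made-up names, and the `${...}` expander only imitates the idea of lookups being resolved in the final message.

```python
import re

# Hypothetical lookup table standing in for log4j-style lookups (assumption).
LOOKUPS = {"env:USER": "alice"}

def expand_lookups(message):
    """Toy interpreter: resolve every ${key} found anywhere in the message."""
    return re.sub(
        r"\$\{([^}]*)\}",
        lambda m: LOOKUPS.get(m.group(1), m.group(0)),  # unknown keys stay as-is
        message,
    )

def printf_style(template, *args):
    # The template is the "code", the args are the "data".
    # Substituted data is never scanned again for directives.
    return template % args

def mixed_style(template, *args):
    # Data is substituted first, then the WHOLE result is interpreted,
    # so directives hidden in the data get executed too.
    return expand_lookups(template % args)

untrusted = "${env:USER}"  # attacker-controlled input
separated = printf_style("login failed for %s", untrusted)
mixed = mixed_style("login failed for %s", untrusted)

print(separated)  # login failed for ${env:USER}
print(mixed)      # login failed for alice
```

In the first call the attacker's string stays inert data. In the second it is interpreted, which is the behaviour that makes the plain "format string" label feel too narrow.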
OWASP lives mostly in the web world, but links to CWE-77 (Command Injection, https://cwe.mitre.org/data/definitions/77.html), which is pretty generic. And log4shell matches the description nicely: 1. data from an untrusted source: yes; 2. data is part of a string that's executed: it is interpreted, which I think qualifies as a yes; 3. the execution gives capabilities the attacker would not otherwise have: oh yes!
So it's probably safe to claim that this could fall under CWE-77 (too bad CVEs rarely use CWEs).
Now, CWE-77 (Command Injection) is a child of "CWE-74: Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')" https://cwe.mitre.org/data/definitions/74.html, which states "The most classic instantiations of this category of weakness are SQL injection and format string vulnerabilities." - that's probably why both of us thought of these two! :) And even if CWE-77 is too specialized (this could be argued), CWE-74 should be a good match, quoting CWE-74: "The software constructs all or part of a command, data structure, or record using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify how it is parsed or interpreted when it is sent to a downstream component" [end quote].
Funnily enough, CWE-74 is a child of CWE-707 "Improper Neutralization" (https://cwe.mitre.org/data/definitions/707.html), which says neutralization can be done by (among others): "[...] transformation of the input/output to be "safe" using techniques such as filtering, encoding/decoding, escaping/unescaping, quoting/unquoting, or canonicalization [...]"
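As a rough illustration of "escaping" as neutralization, here is a small Python sketch. Again this is a toy model and not log4j's actual behaviour or a recommended fix: the `$$` escape convention, `LOOKUPS`, `expand_lookups` and `neutralize` are all assumptions I made up for the example.

```python
import re

# Hypothetical lookup table for the toy interpreter (assumption).
LOOKUPS = {"env:USER": "alice"}

def expand_lookups(message):
    """Toy interpreter: "$$" is a literal dollar, "${key}" is a lookup."""
    def repl(m):
        if m.group(0) == "$$":
            return "$"  # escaped dollar, emit it literally
        return LOOKUPS.get(m.group(1), m.group(0))  # unknown keys stay as-is
    return re.sub(r"\$\$|\$\{([^}]*)\}", repl, message)

def neutralize(data):
    # Escape the special element so the downstream interpreter
    # sees a literal "$" instead of the start of a lookup.
    return data.replace("$", "$$")

untrusted = "${env:USER}"  # attacker-controlled input
raw = expand_lookups("user=" + untrusted)
safe = expand_lookups("user=" + neutralize(untrusted))

print(raw)   # user=alice
print(safe)  # user=${env:USER}
```

The point is only that the special element (`$` here) gets transformed before it reaches the downstream component, which is exactly the step CWE-74 says is missing or done incorrectly.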
:) Thanks for triggering me on this.