Can someone explain to me why JSON can't have comments or trailing commas? I really hope the performance gains are worth it, because I've lost 100s of man-hours to those things, and had to resort to stuff like this in package.json:
"IMPORTANT: do not run the scripts below this line, they are for CICD only": true,
It can't have comments because it didn't originally had comments, so now it's too late. And it originally didn't have comments, because Douglas Cockford thought they could be abused for parsing instructions.
As for not having trailing commas, it's probably a less intentional bad design choice.
That said, if you want commas and comments, and control the parsers that will be used for your JSON, then use JSONC (JSON with comments). VSCode for example does that for its JSON configuration.
JSONC also supports trailing commas. It is, in effect, "JSON with no downsides".
TOML/Yaml always drive me batty with all their obscure special syntax. Whereas it's almost impossible to look at a formatted blob of JSON and not have a very solid understanding of what it represents.
The one thing I might add is multiline strings with `'s, but even that is probably more trouble than it's worth, as you immediately start going down the path of "well let's also have syntax to strip the indentation from those strings, maybe we should add new syntax to support raw strings, ..."
Unfortunately, JSON5 says keys can be ES5 IdentifierName[1]s, which means you must carry around Unicode tables. This makes it a non-option for small devices, for example. (I mean, not really, you technically could fit the necessary data and code in low single-digit kilobytes, but it feels stupid that you have to. Or you could just not do that but then it’s no longer JSON5 and what was the point of having a spec again?)
Or take the SQLite route, if you don't need to reject strictly invalid JSON5. When they extended SQLite's JSON parser to support JSON5, they specifically relaxed the definition of unquoted keys (in a compatible way) to avoid Unicode tables[1]:
> Strict JSON5 requires that unquoted object keys must be ECMAScript 5.1 IdentifierNames. But large unicode tables and lots of code is required in order to determine whether or not a key is an ECMAScript 5.1 IdentifierName. For this reason, SQLite allows object keys to include any unicode characters greater than U+007f that are not whitespace characters. This relaxed definition of "identifier" greatly simplifies the implementation and allows the JSON parser to be smaller and run faster.
Or take the XML route. Unicode offers several strategies to avoid continuously updating the Unicode table [1] which has been adapted by XML 1.1 and also later editions of XML 1.0. The actual syntax can be as simple as the following (based from XML, converted to ABNF as in RFC 8295):
In my opinion, this approach is so effective that I believe every programming language willing to support Unicode identifiers but nothing more complex (e.g. case folding or normalization or confusable detections) should use this XML-based syntax. You don't even need to narrow it down because Unicode explicitly avoided identifier characters outsides of those ranges due to the very existence of XML identifiers!
Yeah, definitely for small devices. For things like VS Code's configuration file format (the parent comment) or other such use cases, I don't see a problem
If you want commas and comments, another term to search for is JWCC, which literally stands for "JSON With Commas and Comments". HuJSON is another name for it.
It’s not that it “can’t”, more like it “doesn’t”. Douglas Crockford prioritized simplicity when specifying JSON. Its BNF grammar famously fits on one side of a business card.
Other flavors of JSON that include support for comments and trailing commas exist, but they are reasonably called by different names. One of these is YAML (mostly a superset of JSON). To some extent the difficulties with YAML (like unquoted ‘no’ being a synonym for false) have vindicated Crockford’s priorities.
There's no real reason for that. It just happened like that. These aren't the only bad decisions made by JSON author, and not the worst either.
What you can do is: write comments using pound sign, and rename your files to YAML. You will also get a benefit of a million ways of writing multiline strings -- very confusing, but sometimes useful.
I don't know the historic reason why it wasn't included in the original spec, but at this point it doesn't matter. JSON is entrenched and not going to change.