OK. I *totally* prefer ordered JSON, because it is so much easier to eyeball - t...

crazygringo · on Aug 16, 2019

+1 for all reasons above for ordered JSON, highly convenient in practice.

And if it doesn't impact performance significantly, these are all pretty good reasons for JSON outputters to default to sorting objects deterministically by keys, or at least to provide a flag to do so. (Even if there's no canonical sort order between JSON libraries, all that matters is it's deterministic for each library.)
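As a minimal sketch of what "sorting deterministically by keys" looks like in practice, here is Python's standard `json` module with its `sort_keys` flag. The sample data and the helper name `dump_sorted` are made up for illustration.

```python
import json

# Two dicts with the same content but different insertion order.
a = {"name": "widget", "id": 7, "tags": ["x", "y"]}
b = {"tags": ["x", "y"], "id": 7, "name": "widget"}

def dump_sorted(obj):
    # sort_keys=True sorts object keys at every nesting level;
    # fixed separators keep the whitespace deterministic as well.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))

print(dump_sorted(a))                    # {"id":7,"name":"widget","tags":["x","y"]}
print(dump_sorted(a) == dump_sorted(b))  # True
```

With output like this, a plain text diff or a byte-for-byte comparison of two documents becomes meaningful, at least within one library.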

BUT... I can't imagine any scenario where you'd want to validate that JSON content was ordered on the input, which is what was enabled in this article. Why does IBM even have that as an option?!

Be strict in what you emit and liberal in what you accept, and all that...

hyperpallium · on Aug 16, 2019

I guess because OrderedJSONObject is ordered, not sorted. Like Java's LinkedHashSet, it maintains the order in which keys are added.

If you're going to rely on a specific order for comparisons, it makes sense to alert the user to any JSON in a different order (instead of silently, liberally accepting it), or you'll get false negatives elsewhere. Easier to check for a sorted order, but also possible to define a specific order. IDK what IBM did here.
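For what it's worth, checking for a sorted order on input is only a few lines in most parsers. A rough sketch in Python, assuming "sorted" means plain string order of the keys; `load_strict` and `keys_sorted_hook` are hypothetical names, and this says nothing about how IBM's option actually works.

```python
import json

def keys_sorted_hook(pairs):
    # pairs holds the (key, value) tuples of one JSON object, in document order.
    keys = [k for k, _ in pairs]
    if keys != sorted(keys):
        raise ValueError(f"object keys not in sorted order: {keys}")
    return dict(pairs)

def load_strict(text):
    # object_pairs_hook is called for every object, including nested ones.
    return json.loads(text, object_pairs_hook=keys_sorted_hook)

print(load_strict('{"a": 1, "b": {"c": 2, "d": 3}}'))

try:
    load_strict('{"b": 1, "a": 2}')
except ValueError as err:
    print("rejected:", err)
```

Checking against a specific, non-sorted order would work the same way, with the comparison against `sorted(keys)` swapped for a comparison against the expected key list.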

Fun fact: jq used to sort keys; now it retains ordering.

diroussel · on Aug 16, 2019

Indeed, and remember this JSON message is going to a mainframe. Mainframes don't have much memory and typically process record-by-record, or event-by-event. So the implementation probably streams the JSON in and constructs the COPYBOOK from the payload before continuing to invoke the COBOL.

So the rework time might be spent writing a general-purpose re-order layer that can re-order any incoming message.
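A very rough sketch of such a re-order layer, sitting in front of the order-sensitive consumer. Everything here is an assumption for illustration: the field list `FIELD_ORDER`, the function name `reorder`, and the restriction to flat messages. It also buffers the whole message in memory, which the streaming side is presumably trying to avoid.

```python
import json

# Hypothetical field order expected by the downstream consumer.
FIELD_ORDER = ["account", "amount", "currency"]

def reorder(message, field_order):
    # Emit known fields first, in the expected order.
    ordered = {k: message[k] for k in field_order if k in message}
    # Append any unknown fields afterwards, sorted so output stays deterministic.
    for k in sorted(message.keys() - set(field_order)):
        ordered[k] = message[k]
    return ordered

incoming = json.loads('{"currency": "USD", "memo": "x", "account": "A1", "amount": 5}')
# json.dumps keeps dict insertion order, so the re-ordered keys survive serialization.
print(json.dumps(reorder(incoming, FIELD_ORDER)))
```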

tantalor · on Aug 16, 2019

> any scenario where you'd want to validate...

Because it's a precondition for something else down the line (a dependency)

lilyball · on Aug 16, 2019

Please XOR your hashes instead of adding them! If you add them, you're losing bits on the low end. EDIT: No you're not. It feels like you should be, but with unsigned overflow, this actually works just fine.

This assumes of course that you're using proper hashes that make use of the full domain of the output type (a proper hash will have a 50% chance of any arbitrary bit being flipped by any change to the input). But if you're not using proper hashes, you're doing something wrong.
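To make the two options concrete, here is a small sketch of combining per-entry hashes so that the result does not depend on entry order. It assumes 32-bit hashes and uses BLAKE2b truncated to 4 bytes as a stand-in hash; the names `h32`, `combine_add`, and `combine_xor` and the sample entries are made up for the example.

```python
import hashlib

MASK = 0xFFFFFFFF  # keep everything in unsigned 32-bit range

def h32(data: bytes) -> int:
    # Stand-in 32-bit hash: BLAKE2b with a 4-byte digest.
    return int.from_bytes(hashlib.blake2b(data, digest_size=4).digest(), "big")

def combine_add(items):
    total = 0
    for item in items:
        total = (total + h32(item)) & MASK  # unsigned wraparound addition
    return total

def combine_xor(items):
    acc = 0
    for item in items:
        acc ^= h32(item)
    return acc

entries = [b'"id":7', b'"name":"widget"', b'"tags":["x","y"]']
shuffled = [entries[2], entries[0], entries[1]]

# Both combiners are commutative and associative, so order does not matter.
print(combine_add(entries) == combine_add(shuffled))
print(combine_xor(entries) == combine_xor(shuffled))
```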

mlyle · on Aug 16, 2019

Can you please explain this assertion? ;)

If I have a 32 bit current hash value-- for any possible 32 bit value I add, I get a different 32 bit value out.

XORing is effectively adding each bit and throwing away the carry bit. Adding just cascades carries to the left.

lilyball · on Aug 16, 2019

You know what, you're right. I made a knee-jerk comment but I didn't think it through all the way. From any arbitrary unsigned 32-bit integer, every other unsigned 32-bit integer is reachable with a single addition. Therefore addition works just fine here.

It still feels wrong to say this, it feels like since adding will effectively shove bits off the high end and drop them on the floor that you're losing information, but I can't actually justify that feeling with reasoning.
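One way to put the reasoning in code: addition modulo 2^32 can always be undone by subtraction modulo 2^32, so nothing is lost even when the sum wraps. A small sketch with arbitrary sample values (`add32` and `sub32` are illustrative helper names):

```python
MASK = 0xFFFFFFFF

def add32(a, b):
    # Unsigned 32-bit addition with wraparound.
    return (a + b) & MASK

def sub32(a, b):
    # Unsigned 32-bit subtraction with wraparound.
    return (a - b) & MASK

acc = 0xFFFFFFF0
h = 0x00000025

combined = add32(acc, h)          # overflows and wraps to 0x00000015
assert combined == 0x00000015
assert sub32(combined, h) == acc  # the wrapped sum is still fully reversible

# Any target is reachable from acc with exactly one addition.
target = 0x12345678
delta = sub32(target, acc)
assert add32(acc, delta) == target
```

The bits that get "shoved off the high end" are only the carry out of bit 31, and the arithmetic is a bijection on the 32-bit values without it.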

mlyle · on Aug 16, 2019

Adding is actually considerably better. For high quality hashes XOR is just as good; but if there are any distributional problems at all in the hash, adding mixes stuff more.

(XORing is effectively adding with all of the carry information lost/falling off).
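A quick sketch of both points, using arbitrary 32-bit sample values. The first assertion checks the identity that a sum is the XOR plus the carries shifted left by one; the second part shows one concrete distributional problem, two entries with the same hash, where XOR cancels to zero and addition does not.

```python
MASK = 0xFFFFFFFF

a, b = 0x9E3779B9, 0x7F4A7C15

# a + b == (a ^ b) + 2 * (a & b): XOR is the sum without the carries,
# and (a & b) << 1 is exactly the carry information XOR throws away.
assert (a + b) & MASK == ((a ^ b) + ((a & b) << 1)) & MASK

# Two entries that happen to hash to the same value.
h = 0xDEADBEEF
xor_twice = h ^ h           # the pair cancels out completely
add_twice = (h + h) & MASK  # the pair still contributes to the total

print(hex(xor_twice))  # 0x0
print(hex(add_twice))  # 0xbd5b7dde
```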

atq2119 · on Aug 16, 2019

Please note that using a sum of hashes almost certainly weakens any cryptographic guarantees you may expect from your hashes. Of course, that may be fine depending on your use case.