Your users will still need to merge manually in a CRDT application, because computers cannot read minds.
The best a CRDT application can do is spit out garbage when changes conflict.
E.g., start with: The quick brown fox jumped over the fence.
We both disconnect. At the same exact time...
I change it to: The quick brown fox ran around the fence.
You change it to: The quick brown fox dug under the fence.
The best you can do is make a data structure that is consistent and identifies the conflict. A CRDT can't "read our minds" and decide between "ran around" or "dug under".
It can't decide human intent flawlessly, but the point of a CRDT is that it does choose one, and all others choose the same one regardless of how they got there.
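To make "it does choose one, and all others choose the same one" concrete, here is a minimal sketch of a last-writer-wins register. The class name `LWWRegister`, the integer logical timestamps, and the replica ids "me" and "you" are all assumptions for illustration, not anything from a particular library. The tie-break on replica id is what makes the choice deterministic when both edits carry the same timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    value: str
    timestamp: int   # logical clock of the write (assumed for this sketch)
    replica_id: str  # unique per replica, used only to break timestamp ties

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Keep the write with the larger (timestamp, replica_id) pair.
        # Both replicas evaluate the same rule, so both keep the same write.
        return max(self, other, key=lambda r: (r.timestamp, r.replica_id))


# Two concurrent edits made while disconnected, with equal timestamps.
mine = LWWRegister("The quick brown fox ran around the fence.", 1, "me")
yours = LWWRegister("The quick brown fox dug under the fence.", 1, "you")

# Merge order does not matter: both sides end up with the same register.
assert mine.merge(yours) == yours.merge(mine)
print(mine.merge(yours).value)
```

Note that one edit is silently dropped here. The register converges, but it does not pretend to know which sentence the humans wanted.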
Git does not do this, so it is not a CRDT. The content-addressable-database portion of git sorta fits this though (as does any other content-addressable system).
This is basically git automatically doing “accept theirs” or “yours” for any fork. You can see that it will not generally be what you want, so whether such a strategy could work is domain-dependent.
No, that’s not accurate. If I merge branch X and then branch Y, and someone else merges branch Y and then branch X, with a CRDT the result should be the same, whereas with git it won’t be if your strategy is always “accept theirs” or “accept yours”. A CRDT is also order invariant: it doesn’t matter in which order you accept the edit operations, the end result is consistent across all nodes.
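A rough sketch of the difference, under made-up assumptions: each edit is a `(timestamp, branch, text)` tuple, `accept_theirs` stands in for a blanket "accept theirs" policy, and `crdt_merge` is a hypothetical deterministic rule (highest tuple wins). None of this is git's actual merge machinery; it only shows why the first policy depends on merge order and the second does not.

```python
from functools import reduce


def accept_theirs(ours, theirs):
    # Blanket policy: whatever is merged in last replaces what we have.
    return theirs


def crdt_merge(a, b):
    # Hypothetical deterministic rule: the larger (timestamp, branch, text)
    # tuple wins, no matter which side it arrives on.
    return max(a, b)


base = (0, "base", "jumped over")
x = (1, "X", "ran around")
y = (1, "Y", "dug under")

# Node 1 merges X then Y; node 2 merges Y then X.
node1_theirs = reduce(accept_theirs, [x, y], base)
node2_theirs = reduce(accept_theirs, [y, x], base)
node1_crdt = reduce(crdt_merge, [x, y], base)
node2_crdt = reduce(crdt_merge, [y, x], base)

assert node1_theirs != node2_theirs  # the two nodes diverge
assert node1_crdt == node2_crdt      # the two nodes converge
```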
I wouldn't argue that CRDT "solves" this problem, either.
The git solution exposes the conflict to the user, who can then fix it. (Or leave it there if they choose to.)
The best a CRDT can do is leave some kind of conflict marker that the user can fix. (Remember, computers can't read minds. See my "quick brown fox" example.)
Git does this. It's predictable and lossless. Deciding if it's a CRDT probably is more of a discussion about semantics than fact, because "git merge" is lossless: It presents a consistent view that the user can accept or change.
But a CRDT would. The end result at node 1 seeing merge X first and then merge Y next would necessarily have to be the same as node 2 seeing merge Y first and then merge X. That’s literally the core property of CRDTs - all nodes eventually converge to the same state regardless of the network partitioning. Git does not have this property and thus is not a CRDT (for edits - it’s a CRDT for mirroring).
Git fails to be a CRDT not because “git merge is lossless” but because the result is order-dependent, which is not partition tolerant.
You may want to read the original paper that defines CRDTs [1]. Here are some choice quotes to help you:
> System model: We consider a system of processes interconnected by an asynchronous network. The network can partition and recover
> Clearly, a sufficient condition for convergence of an op-based object is that all its concurrent operations commute. An object satisfying this condition is called a Commutative Replicated Data Type (CmRDT).
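The commutativity condition in that quote can be checked by brute force on a toy example. This is only a sketch: `apply_delta`, `apply_assign`, and `outcomes` are names invented here, and the "object" is just a plain value. Counter deltas commute, so every delivery order reaches the same state; plain assignment does not commute, so the final state depends on who was delivered last.

```python
from functools import reduce
from itertools import permutations


def apply_delta(state, delta):
    # Op-based counter: addition commutes, so delivery order is irrelevant.
    return state + delta


def apply_assign(state, new_value):
    # Plain overwrite: does not commute, the last delivered op wins.
    return new_value


def outcomes(apply_op, ops, initial):
    # Final states reached over every possible delivery order of ops.
    return {reduce(apply_op, order, initial) for order in permutations(ops)}


counter_outcomes = outcomes(apply_delta, [3, -1, 5], 0)
assign_outcomes = outcomes(apply_assign, ["ran around", "dug under"], "jumped over")

assert len(counter_outcomes) == 1  # all replicas converge
assert len(assign_outcomes) == 2   # replicas can diverge
```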
Git has some CRDT concepts but the core behavior of creating commits and sharing them does not generally meet the criteria of a CRDT. And no. Requiring a manual merge is also not a property of a CRDT as the whole point of it is to generate a “correct” merge result without human intervention. Otherwise the point of the paper would be almost irrelevant.
Whatever happens, the end result on two different nodes doing the same merge operations (or a commutative ordering of those merge operations) would be identical.
Think about a CRDT document: if two people edit the same line, regardless of what happens, once the documents synchronize, the final state of the document will be identical. That’s also the reason manually resolved merges don’t work because two different people might resolve the same conflict in different ways. But again, the conflict resolution being identical under any commutative ordering of simultaneous operations is the hardest requirement of CRDTs. The commutation requirement is what kills the “always theirs” or “always mine” strategy (there are other scenarios but that’s the easiest one to demonstrate).
Ahh, now you're missing some critical details: How can a CRDT perform a sane merge? (Remember my quick brown fox example.) I.e., is it destructive (picks one) or does it output something like: "The quick brown fox !!!(ran around|||dug under)!!! the fence."
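Both answers are possible. A destructive merge picks one value; a non-destructive design, often called a multi-value register, keeps every concurrent value and still converges. Below is a minimal sketch of the second kind, assuming each write carries a version vector (a dict from replica id to counter). The function names, the replica ids, and the "leapt over" resolution are hypothetical.

```python
def dominates(a, b):
    # True if version vector a has seen everything in b, plus something more.
    keys = set(a) | set(b)
    at_least = all(a.get(k, 0) >= b.get(k, 0) for k in keys)
    strictly = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    return at_least and strictly


def merge(left, right):
    # Keep every (value, clock) entry that no other entry supersedes.
    candidates = left + right
    survivors = []
    for value, clock in candidates:
        if any(dominates(other, clock) for _, other in candidates):
            continue  # a later write already saw this one
        if (value, clock) not in survivors:
            survivors.append((value, clock))
    # Sort so the result is the same whichever side was merged first.
    return sorted(survivors, key=lambda entry: entry[0])


base = [("jumped over", {})]
mine = merge(base, [("ran around", {"me": 1})])
yours = merge(base, [("dug under", {"you": 1})])

# Concurrent edits: both values survive, in the same order on every replica.
conflict = merge(mine, yours)
assert conflict == merge(yours, mine)
assert [value for value, _ in conflict] == ["dug under", "ran around"]

# A later write that has seen both edits replaces the pair.
resolved = merge(conflict, [("leapt over", {"me": 2, "you": 1})])
assert [value for value, _ in resolved] == ["leapt over"]
```

In this sketch a human resolving the conflict is just another write that supersedes both candidates, so the resolution replicates like any other edit.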
This is kind of what git does: It leaves a sane conflict in your source code. (The result is always the same given the same inputs, too.) The merge conflict might not build, but how git handles merge conflicts will always result in a functioning git repository.
tbh it's increasingly sounding like you're defining a CRDT as "something is decided and written down in all cases" and simply ignoring every single other quality they guarantee.
Those other qualities matter. So much so that they're literally the defining qualities.
Yeah, I'm done trying to help this person understand the differences between Git and CRDTs. They're being intentionally difficult by redefining CRDTs to "what Git does" rather than evaluating Git against the properties a CRDT is defined to have.
Whether or not the user is manually involved at some point is a product decision. I think the trend of consumer companies is not to do that. Possibly damaging user data is simply a tradeoff in this way of thinking.