Do I understand correctly that multi model mean you can create relational dbs, d...

controversy · on Aug 29, 2020

Let’s play at making a Facebook. You need profile information. That’s information that’s accessed as a block. We can use documents for that. We then want to track relationships. Friends. Friends of friends. We can use a graph. We might need a lightweight cache. Opaque entries accessed by key. We can use a key-value store for that. ArangoDB does all of these. Sometimes you want to join documents to documents, or do any other form of pairing. ArangoDB does that too.

You can then scale this across multiple machines as necessary. The benefit of such a design is that your team only needs to learn one technology, not many. You don’t need to know Redis, Postgres and Neo4j to derive the same benefits.
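To make the three access patterns above concrete, here is a minimal in-memory sketch in plain Python. It is not ArangoDB code and uses no database API; the names `profiles`, `friends`, `cache`, `friends_of_friends` and `friend_profiles`, along with the sample data, are all made up for illustration.

```python
# Toy in-memory stand-ins for the three data models; all names are hypothetical.

# Document model: each profile is fetched as one block.
profiles = {
    "alice": {"name": "Alice", "city": "Berlin"},
    "bob": {"name": "Bob", "city": "Paris"},
    "carol": {"name": "Carol", "city": "Berlin"},
}

# Graph model: friendship edges stored as an adjacency map.
friends = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob"},
}

# Key-value model: opaque entries accessed only by key.
cache = {"session:42": b"opaque-bytes"}


def friends_of_friends(user):
    """Return users exactly two hops away, excluding the user and direct friends."""
    direct = friends.get(user, set())
    two_hops = set()
    for friend in direct:
        two_hops |= friends.get(friend, set())
    return two_hops - direct - {user}


def friend_profiles(user):
    """Join graph edges to documents: the profile of each direct friend."""
    return [profiles[f] for f in sorted(friends.get(user, set()))]
```

In a multi-model database the point is that all three shapes, plus the join in `friend_profiles`, live in one system with one query language rather than three separate stores.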

zaphirplane · on Aug 30, 2020

Isn’t a graph DB a superset of a document DB? Node == document, properties == attributes, edge == relationship.

jlokier · on Aug 30, 2020

At a high level, you could say that you can model your data in either, so either can implement the other, and you can also include relational DBs in that too. They are all "equivalent" in an abstract sense. But it doesn't mean they support all uses equally well.

A graph DB is optimised for traversing a general graph structure, whereas a document DB is optimised for a tree-structured document, and sometimes its queries can't traverse links between documents.
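As a rough illustration of that difference in access pattern (plain Python, not any particular database; `edges`, `document`, `reachable_within` and `get_path` are hypothetical names): a general graph can contain cycles and needs a traversal that tracks visited vertices, while a tree-structured document is read by following one path down from its root.

```python
from collections import deque

# A general graph may contain cycles, so traversal must track visited vertices.
edges = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["a"],  # cycle back to the start
}


def reachable_within(start, max_depth):
    """Breadth-first traversal: map each vertex within max_depth hops to its distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen


# A tree-structured document: access follows one path from the root.
document = {"order": {"customer": {"name": "Ann"}, "items": [{"sku": "X1"}]}}


def get_path(doc, path):
    """Follow a sequence of keys or list indexes down from the document root."""
    current = doc
    for key in path:
        current = current[key]
    return current
```

For example, `reachable_within("a", 1)` visits only `b` and `c`, while `get_path(document, ["order", "items", 0, "sku"])` never leaves the one document.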

Optimised means performance, layout in storage (so locality, retrieval and join patterns), the kinds of query operators that are offered, and even that each one's query language is better suited to a different way of modelling data.

zaphirplane · on Aug 30, 2020

Once you’ve implemented a graph DB, you have a document DB. In a graph DB you may query all vertices with a property x=foo, which translates to getting all documents with field x=foo.

Effectively you can market a graph DB as a document DB, but the reverse isn’t true. What am I missing?

jlokier · on Aug 31, 2020

You're missing that the documentDB will be faster and simpler to use for some kinds of use case, is simpler to understand in some ways, and that the query/update language used by the documentDB will funnel application design towards storage and access patterns that work better with a documentDB.

Of course you can implement a documentDB on top of a graphDB, or market the latter as the former. And of course there are applications running on a documentDB that would be as fast or faster on a graphDB.

The difference is one of "impedance mismatch" rather than anything insurmountable.

For example, if you query all vertices with a property x=foo, then query all properties of those vertices, and then traverse all tree-child-like properties to more vertices, and continue doing this recursively, that query will be like getting all documents with x=foo. But that's more complicated to express in a graphDB QL than a documentDB QL, and likely to run slower on the graphDB (due to data non-locality) if there are many properties or much tree depth.
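A small sketch of that equivalence, assuming toy in-memory structures rather than a real graphDB or documentDB (`doc_store`, `vertices`, `child_edges`, `assemble` and the two query functions are all hypothetical names). The same data is stored once as nested documents and once as vertices joined by labelled child edges; the graph side has to recurse to rebuild what the document side returns in one lookup.

```python
# Document store: the whole tree lives under one key.
doc_store = {
    "d1": {"x": "foo", "author": {"name": "Ann", "address": {"city": "Oslo"}}},
    "d2": {"x": "bar", "author": {"name": "Ben", "address": {"city": "Rome"}}},
}


def docs_where(field, value):
    """Document-style query: match on a top-level field, return whole documents."""
    return [doc for doc in doc_store.values() if doc.get(field) == value]


# Graph store: the same data split into vertices holding scalar properties,
# connected by labelled child edges.
vertices = {
    "v1": {"x": "foo"},
    "v2": {"name": "Ann"},
    "v3": {"city": "Oslo"},
    "v4": {"x": "bar"},
    "v5": {"name": "Ben"},
    "v6": {"city": "Rome"},
}
child_edges = {
    "v1": {"author": "v2"},
    "v2": {"address": "v3"},
    "v4": {"author": "v5"},
    "v5": {"address": "v6"},
}


def assemble(vertex_id):
    """Rebuild a nested document by recursively following child edges.

    Assumes the child edges form a tree (no cycles).
    """
    doc = dict(vertices[vertex_id])
    for label, child_id in child_edges.get(vertex_id, {}).items():
        doc[label] = assemble(child_id)
    return doc


def graph_docs_where(field, value):
    """Graph-style query: find matching vertices, then traverse to rebuild each document."""
    return [assemble(vid) for vid, props in vertices.items() if props.get(field) == value]
```

Both queries return the same nested result for x=foo, but the graph version touches one vertex per level of nesting, which is where the non-locality cost comes from as documents get wider or deeper.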

In general a documentDB stores all the data for a document clustered together without being told to, and likes to retrieve them as a unit. Because that structure is clear, applications tend to be designed around it as an assumption.