We have been using mongo over the last three years on our project and it has been pretty smooth so far. However, lately I gave some thoughts into what our project would look like if we used PostgreSQL instead. I tried to figure out what problems Mongo solves that PostgreSQL doesn't.
I am far from being a database expert, I just know enough basics to query what I need, so feel free to correct/complete the following:
- Mongo has been built to store json objects -> Yes, but from what I understand benchmarks indicate that PostgreSQL is faster at reading/storing/indexing json/jsonb content. I don't think that it is good reason to use it.
- Mongo is schemaless -> There might be some usecases, but I bet in most cases this problem can be worked around. Especially in a database with JSONB support.
- MongoDB horizontal scaling is way easier than PostgreSQL. Yes, it seems that scaling horizontally Mongo is extremely easy compared to any other relational database.
And ... that's it. But there is probably more.
At the moment, here's how I would summarize MongoDB benefits if asked my opinion when starting a project:
- For a small projects or a prototype: ease of use, ease of configuration, don't require too much thinking into my data model while I am experimenting
- For a bigger project: horizontal scaling should be easier
Does that sound accurate to you? Am I missing anything important?
Honestly, the existence of Mongo is mostly an indictment about how user-hostile conventional RDBMS is. The fact that other DBs can do what Mongo does is not helpful when there is no easy workflow to do what Mongo does.
The fact that I theoretically implement a web-based CMS in C and it could be more performant than all these web-language CMS products doesn't mean that C is better for making a CMS.
> "Honestly, the existence of Mongo is mostly an indictment about how user-hostile conventional RDBMS is."
That's nonsense. I've taught people to use SQL before, even people with little to no programming experience. After the initial concepts were understood it was fairly easy to gradually expand knowledge over time.
The basic SQL keywords to start getting useful information out of a RDBMS are:
SELECT
FROM
WHERE
INNER JOIN
LEFT JOIN
ON
AS
AND
OR
Those 9 keywords give you a good starting point for exploring SQL. It's really not hard. You could probably learn enough to get started in a couple of hours, and it's easy to expand your knowledge as and when you need to once you've got the basics sorted.
That's after you set up users, schemas, tables, columns, oddly-named datatypes with unexpected behaviors, etc.
The RDBMS is a lie. It's a beautiful kernel of relational theory wrapped in a 60-foot ball of hacks, tweaks, duct-tape, bubble-gum, and hate. The document store doesn't lie. It doesn't pretend. It's honest that it's stupid and it's a glorified hashtable with a string in it. It doesn't make any ridiculous pretenses of having a Sufficiently Smart Query Optimizer that will inevitably let you down and leave you pulling your hair out trying to figure out why on earth a simple, straightforward query is running so goddamned slow.
Then you build a complicated model and have to figure out from the query plan why your query is slow and deal with indices and foreign keys and all that nonsense.
Meanwhile an object DB may be inefficent and clumsy, but it gets all of that stuff out of the way. Also, if you don't want to join, you can work around that by duplicating the data all over the place. Something you can't do with an RDBMS because tables are fundamentally flat and so you can't stuff a parent-child relationship into a single table.
> That's after you set up users, schemas, tables, columns
You have software engineer you similarly should have a data engineer (i.e. a DBA). Nearly 50 years has passed and we still didn't find a better way to represent data, so perhaps it is the right model. The only difficult part is to bother enough to learn it.
> oddly-named datatypes with unexpected behaviors [...] 60-foot ball of hacks, tweaks, duct-tape, bubble-gum, and hate
That's only when using MySQL
> Meanwhile an object DB may be inefficent and clumsy, but it gets all of that stuff out of the way. Also, if you don't want to join, you can work around that by duplicating the data all over the place. Something you can't do with an RDBMS because tables are fundamentally flat and so you can't stuff a parent-child relationship into a single table.
You absolutely can store data inefficiently in RDBMS for example in Postgres you can create a table with two columns one named key, and another data. The data type for the second column can be JSONB. End you're essentially have equivalent of Mongo's collection.
But in that case you store data inefficiently and if your application starts evolving and you need to make different queries things will get more complex quickly.
> "That's after you set up users, schemas, tables, columns, oddly-named datatypes with unexpected behaviors, etc."
You shouldn't jump in at the deep end when learning this stuff, getting a feel for how an existing database works before creating your own is helpful, and you can download sample databases to play around with when you start learning if you don't have access to one already. Aside from this recommendation, if you're starting out learning about RDBMS, then SQlite is a good starting point, and the setup of a SQlite database is fairly simple.
If you're starting from scratch with a brand new database, here are some of the SQL keywords you'll find useful:
CREATE TABLE
ALTER TABLE
DELETE TABLE
TRUNCATE TABLE
INSERT INTO
UPDATE
SET
DELETE FROM
VARCHAR
INT
DECIMAL
PRIMARY KEY
FOREIGN KEY
REFERENCES
Furthermore, if you're using a decent DB GUI frontend, you don't even need to remember most of the above, aside from VARCHAR (for strings), INT (for whole numbers) and DECIMAL (for numbers with a fractional element). Reason being, you can do all of the database setup graphically. Tools like SQL Server Management Studio help in reducing friction.
I am far from being a database expert, I just know enough basics to query what I need, so feel free to correct/complete the following:
- Mongo has been built to store json objects -> Yes, but from what I understand benchmarks indicate that PostgreSQL is faster at reading/storing/indexing json/jsonb content. I don't think that it is good reason to use it.
- Mongo is schemaless -> There might be some usecases, but I bet in most cases this problem can be worked around. Especially in a database with JSONB support.
- MongoDB horizontal scaling is way easier than PostgreSQL. Yes, it seems that scaling horizontally Mongo is extremely easy compared to any other relational database.
And ... that's it. But there is probably more.
At the moment, here's how I would summarize MongoDB benefits if asked my opinion when starting a project:
- For a small projects or a prototype: ease of use, ease of configuration, don't require too much thinking into my data model while I am experimenting
- For a bigger project: horizontal scaling should be easier
Does that sound accurate to you? Am I missing anything important?