
Well, actually it is very simple once you understand why SQLite performs differently from other database systems. The major difference is that 'normal' database systems have a server process which receives and processes SQL queries and has exclusive access to the database file.

SQLite, on the other hand, has no such process, so every client has to open the database file itself, process the SQL, and write/close the file again.

So the biggest difference is that if you want to use SQLite, you have to limit your database operations to one (or fewer ;-)) processes, so you don't get into a situation where one process has to wait until another has finished writing the database file to the filesystem, just to read it back a moment later. Otherwise the constant reading and writing of the database file will make the whole thing very slow.

But as long as only one process uses the SQLite file and keeps it open for a while, you will see performance comparable to other database systems. SQLite can actually be very fast if you use it in a sane way.
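To make the "keep it open for a while" point concrete, here is a minimal sketch (file and table names are made up) contrasting reopening the file per statement with reusing one long-lived connection:

```python
import sqlite3

DB = "demo.db"  # illustrative file name

def setup():
    with sqlite3.connect(DB) as con:
        con.execute("CREATE TABLE IF NOT EXISTS kv (k INTEGER PRIMARY KEY, v TEXT)")

def reopen_each_time(n):
    # The slow pattern: open, lock, parse the schema, write, fsync,
    # and tear everything down again for every single insert.
    for i in range(n):
        con = sqlite3.connect(DB)
        con.execute("INSERT INTO kv (v) VALUES (?)", (str(i),))
        con.commit()
        con.close()

def keep_open(n):
    # The sane pattern: one connection reused throughout, one commit
    # amortising the fsync cost across all n inserts.
    con = sqlite3.connect(DB)
    for i in range(n):
        con.execute("INSERT INTO kv (v) VALUES (?)", (str(i),))
    con.commit()
    con.close()
```

With many inserts, the second version is typically orders of magnitude faster, because the per-statement overhead of the first version dominates.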



As far as I understand, with WAL, SQLite supports multiple readers plus one writer on a database. And multiple concurrent readers were always supported.

So the limiting factor is not the degree of concurrency, but the amount of write-activity. Workloads that mostly read and only occasionally write to the database work well with SQLite.
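Enabling WAL is a one-line pragma; a small sketch (the file name is illustrative):

```python
import sqlite3

# Switch the journal to write-ahead logging so readers are no longer
# blocked by the single writer. The setting is persistent: it is stored
# in the database file itself, not in the connection.
con = sqlite3.connect("app.db")
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # "wal" once the switch succeeds
con.close()
```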


SQLite is fine with multiple processes as long as they mostly read. And if you switch the journal to WAL mode, it is as good as any multi-process DB without MVCC (which, depending on access patterns, might be as good as it gets - although most patterns do benefit immensely from MVCC).


I'm not sure what you mean by "multi-process DB", but there are plenty of workloads in which SQLite under-performs even non-MVCC databases. It is simply not designed for highly concurrent access patterns. In a previous job, I had to replace the persistence layer for a medium-load (100-500 events/sec) event tracking system, moving from SQLite to SQL Server, which is non-MVCC by default. The latter was considerably more complicated, but also considerably faster and more scalable than the former.

I say this not because SQLite is not good. It's _great_ for what it intends to do, which is, as the linked page says, offer an excellent alternative to fopen(). It doesn't really offer an alternative to a full DBMS, though. There may be some gray area between a "heavy fopen()" workload and a "light DBMS" workload where a legitimate case can be made for SQLite over a DBMS, but standard OLTP or OLAP workloads with any degree of concurrency are almost certainly outside of SQLite's intended use case.

There was a recent conversation on a similar topic with input from Richard Hipp, SQLite's author, here: https://news.ycombinator.com/item?id=15303662


Not really.

In a typical quassel (an IRC bouncer that can, optionally, store logs in SQLite) setup, you'll have between 1 and 10 writers constantly writing hundreds of IRC messages a second to SQLite, while you'll also have readers that need to respond with low latency to queries loading hundreds of thousands of messages, including those that were just written.

The result is that SQLite gets so slow, and the writers queue up so much work, that the IRC connection times out.

SQLite is absolutely useless for multiple writers.


> SQLite is absolutely useless for multiple writers.

I don't quite agree with this, but I would say that with SQLite it's more important to be aware of how concurrency is being used with regard to the database than with something like MySQL or PostgreSQL. As for IRC logs, the important thing (and something that would almost certainly hamstring any SQL database that's actually being ACID) is to not use autocommit on inserts, and to not use a transaction for every single insert (which is possibly what's happening here). I wrote a plugin for ZNC to do exactly what you're describing, and by batching my writes (waiting either 15 seconds or until n messages are queued, where n is maybe 1000 or more), I've gotten superb performance out of SQLite3 running with a WAL journal. The trade-off is that a little bit of data might be lost if your bouncer happens to fall down in an unhandled way. I was also able to get really good read performance without hurting raw write performance by leveraging partial indices, which something like MySQL doesn't give you.


Well, with PostgreSQL everything works just fine. Read and write performance is amazing, and full text search returns instantaneous results.

With SQLite, it’s basically useless.


It really depends... I've worked on systems that PostgreSQL (or any SQL RDBMS) wouldn't be able to keep up with. I've seen many others that would do fine with SQLite.

In the end, it's a matter of testing. Also, with faster drive options (Optane and whatever is next), it could very well be a decent option. It all depends.

Frankly, for logging I'd lean towards Elasticsearch (ELK) these days, and logging is about the most write-heavy situation you're likely to see. Depending on the scaling needs, Mongo, RethinkDB, Elasticsearch and Cassandra can all be better solutions than an RDBMS. I'd still reach for SQLite or PostgreSQL first, depending on the situation, though.


Sure, with Optane it may work, but the usual use case for SQLite is on the slowest SD cards or spinning HDDs you can imagine, where the user didn't want to spend time configuring Postgres.


The key thing you omitted is that you can have multiple readers at once. As long as you don't have really slow reads or writes, and don't have a tonne of writes queuing up, nobody should get starved even with database-level locking.


You’re right, SQLite can be very fast if you know what you’re doing. Expensify was able to achieve 4M queries per second.

https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-q...



