SQLite Performance – Realistic Maximum Size for a SQLite Database

database, performance, sqlite

The article Appropriate Uses For SQLite says that, while SQLite is limited to 140 terabytes, a client/server RDBMS may work better:

An SQLite database is limited in size to 140 terabytes (2^47 bytes, 128 tebibytes). And even if it could handle larger databases, SQLite stores the entire database in a single disk file and many filesystems limit the maximum size of files to something less than this. So if you are contemplating databases of this magnitude, you would do well to consider using a client/server database engine that spreads its content across multiple disk files, and perhaps across multiple volumes.
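(For context, the 140 TB figure follows from SQLite's documented page limits: maximum page size times maximum page count. A quick back-of-the-envelope check, assuming the limits SQLite documented around the time of this question; newer releases have since raised the page-count limit:

    # Rough arithmetic behind the 140 TB figure quoted above (a sketch,
    # using the limits documented at the time: 65536-byte pages and at
    # most 2147483646 pages per database file).
    max_page_size = 65536          # bytes per page
    max_page_count = 2147483646    # pages per database file
    limit = max_page_size * max_page_count
    print(limit)                   # 140737488224256 bytes
    print(limit / 2**40)           # ~128 TiB (tebibytes)
    print(limit / 10**12)          # ~140.7 TB (terabytes)
)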

In general, I agree with this, but I was surprised to learn that SQLite's maximum limit was so high! In my experience I have used quite a few SQL Server databases in the ~30-100 GB range. I have also worked indirectly with much larger databases using Oracle, Postgres, or Cassandra; to my knowledge, none of them approached 140 TB. I'm not a DBA, so this is what I would consider "large" from my direct experience.

I have only considered SQLite for situations where the database would be tiny: dozens of megabytes at most.

After reading this article I'm still not convinced to ever consider SQLite for anything that might require hundreds of gigabytes. But I am wondering if I have been underestimating its capabilities. What is a realistic max size limit for a SQLite database in real-world use?

Best Answer

The realistic limit on the size of an SQLite database is the same as the realistic limit for a data file, and that limit depends a lot on your computer & system. On my current Linux desktop I cannot afford a file much bigger than 350 GB (because, as a rule of thumb, I avoid having a single file eat more than half a disk partition). BTW, that practical limit also affects other SQL RDBMSes like PostgreSQL or MariaDB (but most of them keep their data in several files, which you can put on different file systems, and some of them can manage distributed data on remote machines...)

After reading this article I'm still not convinced to ever consider SQLite for anything that might require hundreds of gigabytes

You are right and wrong.

You are right, because on today's computers (laptops & desktops, not supercomputers or datacenter servers), a hundred gigabytes is still quite a lot of disk space. So in practice, if you are thinking of such a large database, you are better off with a real SQL server (à la PostgreSQL), in particular because you will very probably want remote access, genuinely concurrent access, and probably distributed data & tables.

You are (in principle; I never tried) wrong, because SQLite is very probably capable of dealing with (and is sometimes tested against) a database of several hundred gigabytes, assuming you have a filesystem capable of handling such a large file (and probably at least two such files, e.g. the database plus its dump or backup).

I certainly would (sometimes) consider SQLite for databases of several dozen gigabytes (and I did once try such a large .sqlite file, IIRC about 40 GB). On current (non-supercomputer) machines, I would hesitate to keep many hundreds of gigabytes in an SQLite database, simply because such a file is quite big by today's standards.

IIRC, a hardware vendor selling specialized filesystem machines once told me about a terabyte SQLite application (but I could be wrong).

Of course, SQLite performance depends a lot (like that of all SQL databases) on the number and width of tables, their indexes, and the SQL queries involved. You also don't want simultaneous access by many different processes, and you should use transactions: in my experience, even on a tiny SQLite database of a few megabytes, you really want to wrap, e.g., a thousand insertion requests in BEGIN TRANSACTION & END TRANSACTION; not doing that slows SQLite down by a large factor (more than 10x). A small sketch of that batching pattern follows.
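To make the advice concrete, here is a minimal sketch of the batching pattern, assuming Python's standard sqlite3 module (the file and table names are made up for illustration):

    # A minimal sketch of batching inserts in one transaction.
    import sqlite3

    conn = sqlite3.connect("example.db", isolation_level=None)  # autocommit: no implicit transactions
    conn.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, value TEXT)")

    rows = [("row-%d" % i,) for i in range(1000)]

    # Slow: without an explicit transaction, each INSERT commits (and syncs) on its own.
    # for row in rows:
    #     conn.execute("INSERT INTO items (value) VALUES (?)", row)

    # Fast: wrap the whole batch of insertions in one transaction, as advised above.
    conn.execute("BEGIN TRANSACTION")
    conn.executemany("INSERT INTO items (value) VALUES (?)", rows)
    conn.execute("COMMIT")  # SQLite also accepts END TRANSACTION as a synonym
    conn.close()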

And in my personal experience, with suitable configuration and organization, SQLite is able to manage a database bigger than the available RAM (so 30 GB is not a problem) - but you probably want the indexes to fit in RAM!
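One way to see how the database compares to your RAM is to compute the file's size from SQLite's own pragmas; per-index sizes need the sqlite3_analyzer utility or the optional dbstat virtual table (when compiled in). A minimal sketch, again with Python's sqlite3 module and a made-up file name:

    # Estimate overall database size from SQLite's pragmas.
    import sqlite3

    conn = sqlite3.connect("example.db")
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    print("database size: %d bytes" % (page_count * page_size))
    conn.close()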

If you happen to code something for a "supercomputer" or a costly workstation (with, e.g., 512 GB of RAM, 8 TB of disk and 512 GB of SSD) you certainly can have a terabyte SQLite database. But you will perhaps want to do that only if one (or very few) process(es) is accessing that database. If you have a dozen processes accessing the same database concurrently, better to install a real SQL RDBMS (à la MariaDB or PostgreSQL).

Also note that while the (binary) format of .sqlite database files is documented as being "portable", I much prefer to back up databases in SQL textual form (using sqlite3 mydb.sqlite .dump > mydb.sql). That means I also need additional disk space for the textual dump (which lowers the realistic limit).
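For scripted backups, the same textual dump can also be produced from code; a minimal sketch with Python's standard sqlite3 module, which exposes the dump as Connection.iterdump() (file names made up):

    # Write a SQL text dump, roughly equivalent to the .dump CLI command.
    import sqlite3

    conn = sqlite3.connect("mydb.sqlite")
    with open("mydb.sql", "w") as out:
        for line in conn.iterdump():
            out.write(line + "\n")
    conn.close()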

Usually SQLite is not the bottleneck; the disk might be.

PS. The same reasoning could be applied to large indexed files using GDBM.

PPS. In the expjs branch (Sept. 2016) of my MELT monitor (GPLv3 free software, on github) I am persisting the entire application heap in JSON inside a fresh SQLite database. I've run tiny experiments with several million objects (quite "large") without bad surprises. YMMV.
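(Not the MELT monitor's actual schema, just an illustration of the general idea of persisting objects as JSON rows in an SQLite table, using Python's json and sqlite3 modules:

    # Illustration only: persist objects as JSON text rows, then load them back.
    import json
    import sqlite3

    objects = {"obj%d" % i: {"kind": "demo", "rank": i} for i in range(5)}

    conn = sqlite3.connect("heap.db")
    conn.execute("CREATE TABLE IF NOT EXISTS heap (name TEXT PRIMARY KEY, body TEXT)")
    with conn:  # one transaction for the whole batch
        conn.executemany(
            "INSERT OR REPLACE INTO heap (name, body) VALUES (?, ?)",
            [(name, json.dumps(obj)) for name, obj in objects.items()],
        )
    restored = {name: json.loads(body)
                for name, body in conn.execute("SELECT name, body FROM heap")}
    conn.close()
)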
