MongoDB indexing before adding data

mongodb

A lot of people are saying to do "background indexing" after importing a dump. But what if you don't have a dump? For example, we're spinning up a Docker container with MongoDB on it and installing it on different servers, and the data is going to be added piecemeal. We're talking one record at a time. Would it be a bad thing to do an ensureIndex at the time the database is created? Or create a trigger…? I'm not seeing any answers to this question. Everybody's saying "do it after loading data." I won't get the opportunity after loading data, because as I said only one record at a time goes in. And if the user then tries to search for that record by field name, they're gonna wait forever because it's not indexed. Opinions on what to do? (Please respond politely.)

Best Answer

My general advice would be to add an index before you need to use it extensively. A query without any supporting index results in a collection scan, which reads every document and so becomes slower and more resource-intensive as the data in that collection grows.
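To check whether a given query is falling back to a collection scan, you can inspect its explain output. The following is a minimal mongo shell sketch, assuming the classic explain format where the plan is a tree of `stage` / `inputStage` nodes. The collection name `records` and the field `fieldName` are hypothetical placeholders, and the two helper functions are illustrative, not part of MongoDB.

```javascript
// Walk an explain() plan tree and collect every stage name it contains.
function collectStages(plan, found) {
  if (!plan || typeof plan !== "object") return found;
  if (plan.stage) found.push(plan.stage);
  collectStages(plan.inputStage, found);
  (plan.inputStages || []).forEach(function (child) {
    collectStages(child, found);
  });
  return found;
}

// True when the winning plan contains a COLLSCAN stage,
// i.e. the query reads every document in the collection.
function isCollectionScan(explainOutput) {
  const stages = collectStages(explainOutput.queryPlanner.winningPlan, []);
  return stages.indexOf("COLLSCAN") !== -1;
}

// Only runs inside the mongo shell, where `db` is a global.
// "records" and "fieldName" are hypothetical names.
if (typeof db !== "undefined") {
  const plan = db.records.find({ fieldName: "some value" }).explain("queryPlanner");
  print("collection scan: " + isCollectionScan(plan));
}
```

With a supporting index in place, the winning plan would normally show an `IXSCAN` stage instead of `COLLSCAN`.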

It's fine to create an index before any data is inserted (irrespective of whether that will be followed by individual inserts or bulk import). It may be somewhat faster to insert with fewer indexes, but if you are going to require these indexes at the end of the import there may not be significant overall time savings.
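For the Docker case in the question, creating the index up front could look something like the sketch below. It is written for the mongo shell; `records` and `fieldName` are hypothetical names, and `createSearchIndexes` is just an illustrative wrapper. Note that `ensureIndex` has been a deprecated alias for `createIndex` since MongoDB 3.0, so the sketch uses `createIndex`.

```javascript
// Create the indexes the application will query on.
// Calling createIndex on a collection that does not exist yet creates it,
// and re-running it with the same definition leaves the existing index alone.
function createSearchIndexes(database) {
  // "records" and "fieldName" are hypothetical; substitute your own names.
  return database.getCollection("records").createIndex({ fieldName: 1 });
}

// Only runs inside the mongo shell, where `db` is a global.
if (typeof db !== "undefined") {
  createSearchIndexes(db);
}
```

If the container is based on the official `mongo` image, one option is to place a script like this in `/docker-entrypoint-initdb.d/`, which that image runs on first startup when the data directory is empty. Other images may need a different hook.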

Background indexing is recommended if you have existing data and want to create a new index. As of MongoDB 3.4, a foreground index build is a blocking operation for the database holding that collection and is not advisable in a production environment unless that database is not in use. A background index build will take longer than a foreground index build, but does not block reads or writes.
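As a rough sketch of the background case on a collection that already holds data, again using the hypothetical `records` collection and `fieldName` field. This applies to the MongoDB versions discussed here (such as 3.4); from MongoDB 4.2 onward the `background` option is ignored because the server uses a different index build process.

```javascript
// Build an index on an existing, populated collection without
// blocking reads and writes on the database (MongoDB 3.4-era behaviour).
function buildIndexInBackground(database) {
  // "records" and "fieldName" are hypothetical; substitute your own names.
  return database.getCollection("records").createIndex(
    { fieldName: 1 },
    { background: true }
  );
}

// Only runs inside the mongo shell, where `db` is a global.
if (typeof db !== "undefined") {
  buildIndexInBackground(db);
}
```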

Note: the background index option only affects the initial index build. Once an index has been created, it is always kept up to date as part of each document write (insert, update, or delete).
