I have a single-instance MongoDB deployment with a collection containing 3 fields: _id, block_id, payload.
- Payloads are always 4096-byte binaries
- _id is a monotonically increasing unique integer
There is a secondary index on the collection:
{ "v" : 1, "key" : { "block_id" : 1, "_id" : -1 },
"ns" : "testdb.testdev", "name" : "_block_id_id" }
I'm doing many queries like:
query: { query: { block_id: 868413 }, orderby: { _id: -1 } } ntoreturn:1 nscanned:1 nreturned:1 reslen:4166 163ms
There is no other query running during these. When I read sequentially by block_id, queries are about 10 times faster than when I query with random block_id values. CPU usage and storage utilization are both low, and the collection is 2-3 times larger than available memory.
What might be the bottleneck here?
Best Answer
A couple of things to clarify here:
In terms of the stats, there are two basic ways to get them. The first, and quickest, is to use mongostat and mongotop. These two utilities ship with MongoDB and can be used to figure out what your database is up to.
The other option is MMS (the MongoDB Monitoring Service) - it's free, and it lets you graph all of the relevant stats over time, so you can determine what is spiking or dipping when you see slowness. I recommend installing munin-node if you go down this route (see the MMS docs), because it will give you a view of IO stats as well as MongoDB stats.
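As a quick illustration of what to do with those numbers, here is a small sketch that pulls the faults column out of mongostat-style output. The sample lines are fabricated (real mongostat output has more columns), but the header-plus-rows structure is the same idea:

```python
# Sketch: extract the "faults" column from mongostat-style output.
# SAMPLE is made-up data for illustration; parse real output the same way.
SAMPLE = """\
insert query update delete faults
    *0    45     *0     *0    312
    *0    52     *0     *0    298
"""

def faults_per_sample(text):
    """Return the faults value from each data row, using the header
    line to locate the column."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    idx = header.index("faults")
    return [int(row.split()[idx]) for row in lines[1:]]

print(faults_per_sample(SAMPLE))  # → [312, 298]
```

Sustained non-trivial fault counts mean reads are being served from disk rather than memory.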
You will usually be looking for one of the following:
- Page faults spiking in mongostat or MMS - a sign that reads are being served from disk rather than memory
- High disk utilization or IO wait in the munin graphs while CPU stays low
- Resident memory noticeably smaller than your data size, meaning the working set does not fit in RAM
There are plenty of other things to look at, but that is a good start given your description. Remember, if you have memory contention, the newer something is, the more likely it is to already be in memory. Since you are using a sequential _id, I would expect the older IDs (unless recently updated or touched) to show up in the slow query log more often than newer IDs - that is how the OS generally manages memory (see LRU for more).
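The sequential-versus-random gap you measured is easy to reproduce with a toy model: an LRU page cache smaller than the data, plus simple OS-style readahead. Sequential access benefits from readahead (each fault brings in the next few pages, which are about to be used); random access mostly misses once the data outgrows the cache. A pure-Python sketch, no MongoDB required - the page counts and readahead depth are assumptions, chosen to mirror your 2-3x data-to-memory ratio:

```python
from collections import OrderedDict
import random

def hit_rate(page_accesses, cache_pages, readahead=8):
    """Simulate an LRU page cache with simple readahead: on a miss,
    the faulted page and the next `readahead` pages are loaded together.
    Returns the fraction of accesses served from cache."""
    cache = OrderedDict()
    hits = 0
    for p in page_accesses:
        if p in cache:
            hits += 1
            cache.move_to_end(p)      # mark as most recently used
        else:
            for q in range(p, p + readahead + 1):
                cache[q] = True
                cache.move_to_end(q)
            while len(cache) > cache_pages:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(page_accesses)

random.seed(1)
total_pages = 30000                   # data ~3x larger than the cache
cache_pages = 10000
sequential = list(range(total_pages))
rand = [random.randrange(total_pages) for _ in range(total_pages)]
seq_rate = hit_rate(sequential, cache_pages)
rnd_rate = hit_rate(rand, cache_pages)
print(f"sequential hit rate: {seq_rate:.2f}, random hit rate: {rnd_rate:.2f}")
```

Sequential access hits on roughly readahead/(readahead+1) of reads, while random access hits only about as often as the cached fraction of the data - the same order-of-magnitude gap you are seeing.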
In terms of tackling this type of performance issue, besides adding more RAM, you should look at:
- Faster storage (SSDs in particular), to reduce the cost of each random read
- Reducing document size where possible, so more of the working set fits in memory
- Sharding, to spread the data (and therefore the working set) across the RAM of multiple machines
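For the RAM option, a back-of-envelope estimate helps decide how much you need. Your slow-query line shows reslen of 4166 bytes per document; the per-document index overhead and the document count below are assumptions for illustration, not measured values:

```python
def working_set_gb(doc_count, doc_bytes=4166, index_bytes_per_doc=64):
    """Rough RAM estimate for keeping the whole collection hot:
    doc_bytes matches the reslen from the slow-query line above;
    index_bytes_per_doc is an assumed rough figure, not measured."""
    total_bytes = doc_count * (doc_bytes + index_bytes_per_doc)
    return total_bytes / 1024**3

# e.g. 10 million documents (hypothetical count)
print(f"{working_set_gb(10_000_000):.1f} GiB")  # → 39.4 GiB
```

If only a fraction of block_ids are hot, you can scale the count down accordingly; the point is to compare the result against the RAM you can actually provision before reaching for SSDs or sharding.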