MongoDB local database files explode in size and number

disk-space-utilization, mongodb

I have three replica sets for my shard, but I noticed that the local database on each replica set node has produced a lot of local.* data files. The disk where MongoDB stores the data files is ~50G in size, but the local.* data files take up 22G. The actual data doesn't even amount to one GB (for now). I read the Excessive Disk Space article, which states that the local database should only take up to 5% of the disk space.

I still don't know whether I should set the --oplogSize, the -fallocate or the --noprealloc switch, and how they would affect the other replicated databases.
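For reference, the current oplog allocation can be checked from the mongo shell before picking one of those switches; the line below is only a sketch, not output from my nodes:

PRIMARY> db.printReplicationInfo();   // reports the configured oplog size and the time span it currently covers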

# ll mongo -h
total 23G
-rw------- 1 mongod mongod  64M Mar 26 10:20 test.0
-rw------- 1 mongod mongod 128M Mar  1 14:03 test.1
-rw------- 1 mongod mongod  16M Mar 26 10:19 test.ns
-rw------- 1 mongod mongod  64M Mar 26 10:20 production.0
-rw------- 1 mongod mongod 128M Feb 29 18:28 production.1
-rw------- 1 mongod mongod  16M Mar 23 17:39 production.ns
drwxr-xr-x 2 mongod mongod 4.0K Feb 29 18:28 journal
-rw------- 1 mongod mongod  64M Feb 29 18:01 local.0
-rw------- 1 mongod mongod 128M Feb 29 18:00 local.1
-rw------- 1 mongod mongod 2.0G Feb 29 18:00 local.10
-rw------- 1 mongod mongod 2.0G Feb 29 18:00 local.11
-rw------- 1 mongod mongod 2.0G Mar 26 10:20 local.12
-rw------- 1 mongod mongod 2.0G Mar 26 10:20 local.2
-rw------- 1 mongod mongod 2.0G Feb 29 18:00 local.3
-rw------- 1 mongod mongod 2.0G Feb 29 18:00 local.4
-rw------- 1 mongod mongod 2.0G Feb 29 18:00 local.5
-rw------- 1 mongod mongod 2.0G Feb 29 18:00 local.6
-rw------- 1 mongod mongod 2.0G Feb 29 18:00 local.7
-rw------- 1 mongod mongod 2.0G Feb 29 18:00 local.8
-rw------- 1 mongod mongod 2.0G Feb 29 18:00 local.9
-rw------- 1 mongod mongod  16M Mar 26 10:19 local.ns
-rwxr-xr-x 1 mongod mongod    6 Feb 29 18:01 mongod.lock
drwxr-xr-x 2 mongod mongod 4.0K Mar  1 14:03 _tmp

I'm using MongoDB 2.0.4 on CentOS 6.2 (64-bit).

Update: When I count the documents in the oplog.rs collection in the local database, I get this:

PRIMARY> db.oplog.rs.count();
130234
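
The count alone doesn't explain the disk usage: the oplog is a capped collection whose extents are preallocated up to the configured size regardless of how many entries it holds. Its allocated size can be read from the collection stats, for example (a sketch, run against the local database):

PRIMARY> db.oplog.rs.stats().storageSize;   // allocated size of the oplog collection in bytes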

Best Answer

Ah, I looked at the wrong row in df: / is 50G, but the data is stored on another mounted volume with 1.8T, so the 5% rule makes sense now (meaning the local database could still grow up to ~90G). I will therefore use the --oplogSize parameter, since that size is overkill for our use case.

(However, it seems that --oplogSize can't change the size of an existing oplog; I have to check whether I can delete those files first.)
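For reference, --oplogSize is given in megabytes and only takes effect when the oplog is (re)created, which is why the existing local.* files don't shrink on their own. A restart with a smaller oplog would look roughly like the line below; the replica set name and dbpath are placeholders, not my actual configuration:

mongod --replSet rs0 --dbpath /data/mongo --oplogSize 2048

This would cap a newly created oplog at about 2 GB instead of the ~5%-of-free-disk default.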
