Recovering data from MongoDB raw files

mongodb

We use MongoDB with a replica set of two servers, but we mistakenly deleted some of the raw data files under /path/to/dbdata on both servers. Afterwards we ran extundelete on both servers to get the deleted files back and merged the results (database.1, database.2, and so on). mongod will not start: both starting mongod and running mongodump fail with the same error. Here is the console output:

root@mongod:/opt/mongodb# mongodump --repair --dbpath /opt/mongodb -d database_production
Thu Aug 21 16:22:43.258 [tools] warning: repair is a work in progress
Thu Aug 21 16:22:43.258 [tools] going to try and recover data from: database_production
Thu Aug 21 16:22:43.262 [tools]   Assertion failure isOk() src/mongo/db/pdfile.h 392
0xde1b01 0xda42fd 0x8ae325 0x8ac492 0x8bd8e0 0x8c1c51 0x80e345 0x80e607 0x80e6a4 0x6db92a     0x6dc1ff 0x6e0db9 0xd9e45e 0x6ccdc7 0x7f499d856ead 0x6ccc29 
mongodump(_ZN5mongo15printStackTraceERSo+0x21) [0xde1b01]
mongodump(_ZN5mongo12verifyFailedEPKcS1_j+0xfd) [0xda42fd]
mongodump(_ZNK5mongo7Forward4nextERKNS_7DiskLocE+0x1a5) [0x8ae325]
mongodump(_ZN5mongo11BasicCursor7advanceEv+0x82) [0x8ac492]
mongodump(_ZN5mongo8Database19clearTmpCollectionsEv+0x160) [0x8bd8e0]
mongodump(_ZN5mongo14DatabaseHolder11getOrCreateERKSsS2_Rb+0x7b1) [0x8c1c51]
mongodump(_ZN5mongo6Client7Context11_finishInitEv+0x65) [0x80e345]
mongodump(_ZN5mongo6Client7ContextC1ERKSsS3_b+0x87) [0x80e607]
mongodump(_ZN5mongo6Client12WriteContextC1ERKSsS3_+0x54) [0x80e6a4]
mongodump(_ZN4Dump7_repairESs+0x3a) [0x6db92a]
mongodump(_ZN4Dump6repairEv+0x2df) [0x6dc1ff]
mongodump(_ZN4Dump3runEv+0x1b9) [0x6e0db9]
mongodump(_ZN5mongo4Tool4mainEiPPc+0x13de) [0xd9e45e]
mongodump(main+0x37) [0x6ccdc7]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x7f499d856ead]
mongodump(__gxx_personality_v0+0x471) [0x6ccc29]
assertion: 0 assertion src/mongo/db/pdfile.h:392
Thu Aug 21 16:22:43.271 dbexit: 
Thu Aug 21 16:22:43.271 [tools] shutdown: going to close listening sockets...
Thu Aug 21 16:22:43.271 [tools] shutdown: going to flush diaglog...
Thu Aug 21 16:22:43.271 [tools] shutdown: going to close sockets...
Thu Aug 21 16:22:43.272 [tools] shutdown: waiting for fs preallocator...
Thu Aug 21 16:22:43.272 [tools] shutdown: closing all files...
Thu Aug 21 16:22:43.273 [tools] closeAllFiles() finished
Thu Aug 21 16:22:43.273 [tools] shutdown: removing fs lock...
Thu Aug 21 16:22:43.273 dbexit: really exiting now

My environment:

  1. Debian 3.2.35-2 x86_64 (a Xen virtual machine)
  2. MongoDB 2.4.6

We did not delete the .0 and .ns files.

We also tried creating a new database with the same name and copying db.ns, db.2, and db.3 into it, but we got the same error.

Is there any way to check the validity of the raw .ns and data files, and how can we recover the database?

Best Answer

As Stennie correctly pointed out, your data is almost certainly lost.

The reason is that MongoDB replication is not a binary replication of the data files. Instead, write operations are saved in optimized form in a special collection called the "oplog", which the secondaries read with a tailable cursor. Once MongoDB has optimized a write, it is recorded in the oplog, and the connected secondaries read and apply the resulting operation or operations.
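To make the difference between logical and binary replication concrete, here is a minimal toy sketch. It is not MongoDB code: the `Primary` and `Secondary` classes and the entry layout are invented for illustration, and it assumes that "optimized" means the logged entry carries the resulting value (a set) rather than the original increment, so that replaying it is harmless.

```python
# Toy model of logical (oplog-style) replication. Not MongoDB code:
# the class names and entry layout are invented for illustration.

class Primary:
    def __init__(self):
        self.docs = {}    # _id -> document
        self.oplog = []   # ordered list of logical operations

    def insert(self, doc):
        self.docs[doc["_id"]] = dict(doc)
        self.oplog.append({"op": "i", "o": dict(doc)})

    def increment(self, _id, field, amount):
        doc = self.docs[_id]
        doc[field] = doc.get(field, 0) + amount
        # Assumption: log the resulting value rather than the increment,
        # so replaying the entry more than once gives the same result.
        self.oplog.append({"op": "u", "o2": {"_id": _id},
                           "o": {"$set": {field: doc[field]}}})


class Secondary:
    def __init__(self):
        self.docs = {}

    def apply(self, entry):
        if entry["op"] == "i":
            self.docs[entry["o"]["_id"]] = dict(entry["o"])
        elif entry["op"] == "u":
            self.docs[entry["o2"]["_id"]].update(entry["o"]["$set"])


primary = Primary()
primary.insert({"_id": 1, "visits": 0})
primary.increment(1, "visits", 5)

secondary = Secondary()
for entry in primary.oplog:
    secondary.apply(entry)
# Replaying the last update a second time does not change the outcome.
secondary.apply(primary.oplog[-1])

print(secondary.docs == primary.docs)
```

Nothing in this exchange says where on disk a document ends up. The secondary only learns what changed, and decides for itself where to store it.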

Hence, depending on the state the data files of each member were in when replication was initialized, the same document might reside in data file 1 on the primary, in data file 42 on one secondary, and in data file 3 on another. The bottom line is that the position of a document in the data files is not predictable*, let alone guaranteed. That is why merging files recovered from two different servers does not give you a consistent set of data files.

I hate to say it, but save for extremely expensive data forensics, your data is irrecoverable.

*Although it can be described as "the first contiguous free range of bytes able to hold the binary representation of the document's data and padding."
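The placement rule in the footnote can be illustrated with a toy first-fit allocator. This is only a sketch under loud assumptions: the `Member` class, the file size of 100 bytes, and the document sizes are all made up, and the real storage engine is considerably more involved. It only shows why two members that apply the same operations can end up with the same document in different files.

```python
# Toy first-fit allocator over numbered "data files". This is NOT the
# real MongoDB allocator; sizes and names are made up for illustration.

class Member:
    def __init__(self, file_size, file_count):
        # free[i] = bytes still unused in data file i
        self.free = [file_size] * file_count
        self.location = {}  # _id -> index of the file holding the document

    def insert(self, _id, size):
        # Place the document in the first file with enough free space.
        for index, available in enumerate(self.free):
            if available >= size:
                self.free[index] -= size
                self.location[_id] = index
                return index
        raise RuntimeError("no data file has room for this document")


# Both members receive the same logical operations ...
ops = [("a", 60), ("b", 60), ("c", 30)]

primary = Member(file_size=100, file_count=4)
for _id, size in ops:
    primary.insert(_id, size)

# ... but this secondary already had other data in file 0 when it
# started applying them, so its free space differs.
secondary = Member(file_size=100, file_count=4)
secondary.insert("older-data", 90)
for _id, size in ops:
    secondary.insert(_id, size)

print(primary.location)
print(secondary.location)
```

In this toy run the primary holds "b" in file 1 while the secondary holds it in file 2. Combining file 0 from the secondary with file 1 from the primary would give you "older-data" and "b" but neither "a" nor "c", which is the kind of inconsistency you get from merging recovered files across servers.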
