As with most database systems, MongoDB's data files do not shrink when you delete data; deleted records are simply marked as free and the space is reused.
You'll need to run db.repairDatabase() to compact that space, as noted in the MongoDB documentation.
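In the mongo shell that looks roughly like this (a sketch, not a recipe: repairDatabase rewrites every data file, needs a lot of free disk space, and blocks the database while it runs; the collection name is made up):

```javascript
// Run against the database whose files you want to shrink:
db.repairDatabase()

// Newer servers can also compact a single collection in place:
db.runCommand({ compact: "customers" })   // "customers" is a hypothetical collection
```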
Note: do not forget that MongoDB has a limit on document size (16 MB in current versions); check the documentation if you plan to embed a lot of data in one document.
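You can check how close a document is to that limit from the shell, something like this (the collection and query are made up; Object.bsonsize is a legacy mongo shell helper, so check that your shell provides it):

```javascript
// Fetch a hypothetical document and measure its BSON size in bytes:
var doc = db.hotels.findOne({ name: "Grand Hotel" })
Object.bsonsize(doc)   // must stay under 16 * 1024 * 1024
```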
Much of the point of MongoDB is to store your data denormalized and avoid 'joins', but if your data is entirely separate, it should still be stored in separate collections.
On our site, we have a few different collections, and one of them is linked via a reference. Whether those references are resolved for you depends on which driver you are using.
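If your driver doesn't resolve references for you, doing it by hand is just a second query, roughly (collection and field names are made up for illustration):

```javascript
// A hypothetical order document that stores the customer's _id:
var order = db.orders.findOne({ _id: someOrderId })   // someOrderId is a placeholder

// Resolve the reference with a second lookup:
var customer = db.customers.findOne({ _id: order.customerId })
```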
The other thing to consider is how you will be updating the data.
MongoDB mmaps the entire collection into memory and lets your OS decide which parts get paged out to disk and which stay resident. There shouldn't be much difference in performance between one large collection and two medium-sized collections if the total size of the dataset is the same. One consideration here is indexes: if you combine the data into one collection and a single index can cover your queries, you may be able to look up the data more quickly.
So, you could have a collection with a document for each of your hotels, containing a property called 'customers' which is an array of hashes with the details of each customer, and you can push and pull items on that array (or you can make it a hash keyed on a unique customer identifier for easier access). Don't forget about the 16 MB limit though.
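A sketch of that layout and the updates in the mongo shell (all the names and values are made up):

```javascript
// One document per hotel, with customers embedded as an array:
db.hotels.insert({ name: "Grand Hotel", customers: [] })

// Push a customer onto the array:
db.hotels.update(
  { name: "Grand Hotel" },
  { $push: { customers: { id: 42, name: "Alice" } } }
)

// Pull that customer back off by a field match:
db.hotels.update(
  { name: "Grand Hotel" },
  { $pull: { customers: { id: 42 } } }
)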
It would be easier to answer your question with more context and detail about what you are trying to store, and what kind of queries you need to run against the data.
Best Answer
You can work out how much compaction will gain by comparing two values from db.stats(): dataSize tells you how much data is in the collection, while storageSize tells you how big the files are on disk. dataSize <= storageSize always holds, and the size of the gap tells you how much you'll gain through compaction.
Mongo doesn't allow objects to be split up, so you won't get cases where an object is scattered across the data files. Where this comes into play is when an object grows past its free allocation: the entire object has to be rewritten somewhere bigger.
When I was playing with Mongo databases, a compaction in a quarterly maintenance window was all we needed. But then, our dataset didn't have a whole lot of deletions, so we weren't creating voids that often. To figure out your rate, track those two dbStats values and see how they move over time.
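A tiny helper for turning those two numbers into an estimated gain (plain JavaScript; the stats values are made up for illustration):

```javascript
// Estimate the fraction of a collection's files that compaction could reclaim,
// given the dataSize and storageSize fields reported by db.stats().
function compactionGain(dataSize, storageSize) {
  if (storageSize <= 0) return 0;
  return (storageSize - dataSize) / storageSize;
}

// Example: 60 MB of live data sitting in 100 MB of files.
var gain = compactionGain(60 * 1024 * 1024, 100 * 1024 * 1024);
console.log((gain * 100).toFixed(0) + "% of the files could be reclaimed");
// prints "40% of the files could be reclaimed"
```

Track how that percentage moves between maintenance windows; when it climbs high enough to matter for your disk budget, it's time to compact.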