MongoDB – File System Implementation with GridFS

file-systems, mongo

I am working on two projects that will both implement a WebDAV server backed by MongoDB GridFS. In each case, the system may need to store tens of millions of files spread across thousands of hierarchical directories.

I can come up with two different ways of storing the directory structure:

  • As a "true" hierarchical file system, with directories containing the IDs (_id) of subdirectories and regular files. The paths will be separated by slashes (/) as in a POSIX-compliant file system.

      • The path /a/b/c will be represented as a directory a containing a directory b containing a file c.

  • As a flat file system, where file names include the slashes.

      • The path /a/b/c will be stored as a single file with the name /a/b/c.
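To make the two layouts concrete, here is a rough sketch using plain Python dicts as stand-ins for MongoDB collections. The field names (name, type, children), the IDs, and both helper functions are hypothetical and only illustrate the shapes; each counted "lookup" stands for one query that a real implementation would send to the server.

```python
# Option 1 (hierarchical): every directory document lists the _ids of its children.
# The dict keys play the role of _id; all names here are made up for illustration.
nodes = {
    "root": {"name": "", "type": "dir", "children": ["id_a"]},
    "id_a": {"name": "a", "type": "dir", "children": ["id_b"]},
    "id_b": {"name": "b", "type": "dir", "children": ["id_c"]},
    "id_c": {"name": "c", "type": "file"},
}

# Option 2 (flat): the full path, slashes included, is the file name.
flat = {"/a/b/c": {"type": "file"}}


def resolve_child_links(path):
    """Walk from the root one path component at a time.

    Returns (node_id, lookups); node_id is None if the path does not exist.
    """
    node_id = "root"
    lookups = 1  # fetching the root directory document
    for part in [p for p in path.split("/") if p]:
        children = nodes[node_id].get("children", [])
        lookups += 1  # one query per level: "child of this directory named part"
        match = next((c for c in children if nodes[c]["name"] == part), None)
        if match is None:
            return None, lookups
        node_id = match
    return node_id, lookups


def resolve_flat(path):
    """A single lookup keyed on the whole path. Returns (document, lookups)."""
    return flat.get(path), 1


print(resolve_child_links("/a/b/c"))
print(resolve_flat("/a/b/c"))
```

In this toy model the hierarchical lookup costs one query per path component plus one for the root, while the flat lookup is always a single query. On the other hand, renaming directory a touches one document in the hierarchical layout but every file name under /a in the flat one.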

What are the advantages and disadvantages of each, with respect to a "real" folder-based file system?

Best Answer

Have you looked at http://www.mongodb.org/display/DOCS/Trees+in+MongoDB? It looks like you are choosing between the "Child Links" and "Materialized Paths" patterns described there. Based on the commentary, your second idea (flat file names that contain the full path) seems a much better fit for Mongo. Storing each subdirectory's _id is a little too relational: it implies linking and joins, which are not things that Mongo excels at.
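With full paths stored as file names, a directory listing becomes a prefix match on the name. Below is a minimal sketch of how such filters could be built, assuming the path lives in the filename field of the GridFS files collection. The helper names are my own, the filters are only simulated locally with Python's re module rather than run against a server, and it is an assumption that the output of re.escape is acceptable to the server's regex engine.

```python
import re


def subtree_query(directory):
    """Filter matching every file anywhere below `directory`."""
    prefix = directory.rstrip("/") + "/"
    return {"filename": {"$regex": "^" + re.escape(prefix)}}


def children_query(directory):
    """Filter matching only files directly inside `directory`."""
    prefix = directory.rstrip("/") + "/"
    return {"filename": {"$regex": "^" + re.escape(prefix) + "[^/]+$"}}


def simulate(query, filenames):
    """Local stand-in for a find(): apply the filter's regex to a list of names."""
    pattern = re.compile(query["filename"]["$regex"])
    return [name for name in filenames if pattern.search(name)]


names = ["/a/b/c", "/a/b/d/e", "/a/bc", "/x"]
print(simulate(subtree_query("/a/b"), names))
print(simulate(children_query("/a/b"), names))

# With pymongo and the default GridFS bucket, the same filter would be passed as
# db.fs.files.find(subtree_query("/a/b")), where db is a Database handle.
```

Because both patterns are anchored at the start of the string, an index on filename should be able to serve them as a range scan, but that is worth confirming with explain() on real data.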
