Data Structures – History of Hierarchical Filesystem

data structures, language-agnostic, operating systems

I'm self-taught and I don't have a CS degree. The more I learn about data structures, the more I wonder: in this day and age, how are we still saddled with the filesystem, with directories and files, as the basic data storage structure on the OS?

I understand the simplicity of it, but it seems nowadays that there could be more options available natively. As far as I'm aware, the only project to improve the basic functionality of the filesystem was ReiserFS, where you could tell what line of a file was changed by whom, and when.

For instance, if I could have native tagging for files, where I could tag images, diagrams, word-processing documents, an entire code repository, all as belonging to a single project, that would really be helpful to me. Since I'm stuck in the filesystem paradigm, I know that I could put all those into a single folder/directory, but what if they already exist in disparate directories, and they need to stay there? I know there are programs out there that can do this, but why aren't they on the filesystem?

Something that would be nice to have is some kind of relational feature in the filesystem, like you get with RDBMSes. I understand that something like this (WinFS) was supposed to be part of Vista/7, but it fell off the feature list too.

Sure, any program can store a binary file and have any data structure it wants in it, but why couldn't the OS offer more complex ways of storing data, beyond the simple hierarchy of the filesystem?

Best Answer

Start with this: http://en.wikipedia.org/wiki/Unix_File_System

Read this: http://www.unix.org/what_is_unix/history_timeline.html

Then read this: http://www.amazon.com/UNIX-Filesystems-Evolution-Design-Implementation/dp/0471164836

There's a simple answer to "why couldn't the OS offer more complex ways of storing data, beyond the simple hierarchy of the filesystem?"

Because it's too much for the OS to do.

That's what libraries and application packages are for.

Oracle, for example, will sell you a file-system-like set of features that you manage with the Oracle toolset.

Python's standard library includes the dbm module, which builds persistent key-value stores on top of ordinary files.
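To give a feel for what that looks like, here is a minimal sketch using the standard dbm module. The function names (`remember_project`, `lookup_project`), the sample path, and the project name are made up for illustration; the only real API used is `dbm.open` and its dict-like interface. Keys and values are stored as bytes, so the sketch encodes and decodes them explicitly.

```python
import dbm
import os
import tempfile


def remember_project(db_path, file_path, project):
    """Record which project a file belongs to (hypothetical helper)."""
    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.open(db_path, "c") as db:
        db[file_path.encode("utf-8")] = project.encode("utf-8")


def lookup_project(db_path, file_path):
    """Return the project recorded for a file, or None if there is none."""
    key = file_path.encode("utf-8")
    with dbm.open(db_path, "r") as db:
        if key in db:
            return db[key].decode("utf-8")
        return None


# Usage: keep the demo database in a throwaway directory.
db_path = os.path.join(tempfile.mkdtemp(), "projects")
remember_project(db_path, "/home/me/pictures/diagram.png", "apollo")
print(lookup_project(db_path, "/home/me/pictures/diagram.png"))
```

The store itself is just one or more regular files next to `db_path`; the structure is supplied by the library, not by the OS.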

CouchDB and MongoDB (and others) are sophisticated storage systems that offer database-like features.

The point is that the OS should do the minimum, and everything else is an add-on.
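As a rough illustration of the add-on approach, the tagging described in the question (files in disparate directories that all belong to one project) can be sketched with Python's standard sqlite3 module. This is only a sketch under assumptions: the table name `file_tags`, the helper functions, and the sample paths are all hypothetical, and a real tool would need to decide where the index lives and how to keep it current.

```python
import sqlite3


def open_index(db_file=":memory:"):
    """Open (or create) a tag index; ':memory:' keeps the demo off disk."""
    conn = sqlite3.connect(db_file)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_tags ("
        " path TEXT NOT NULL,"
        " tag TEXT NOT NULL,"
        " PRIMARY KEY (path, tag))"
    )
    return conn


def tag(conn, path, tag_name):
    """Attach a tag to a path; tagging the same pair twice is a no-op."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO file_tags (path, tag) VALUES (?, ?)",
            (path, tag_name),
        )


def files_with_tag(conn, tag_name):
    """Return every path carrying the tag, sorted by path."""
    rows = conn.execute(
        "SELECT path FROM file_tags WHERE tag = ? ORDER BY path",
        (tag_name,),
    )
    return [row[0] for row in rows]


# Usage: files stay where they are; only the index knows they are related.
conn = open_index()
tag(conn, "/home/me/pictures/logo.png", "project-x")
tag(conn, "/home/me/src/project-x", "project-x")
tag(conn, "/home/me/docs/budget.odt", "taxes")
print(files_with_tag(conn, "project-x"))
```

The trade-off is that the index lives outside the filesystem, so it goes stale if a file is moved or renamed by a program that does not know about it.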
