Efficient data structure to implement fake file system

data structuresfile-systemstrie

I want to implement a data structure that will hold the paths of directories, sort of fake file system.

Input:- I have a text configuration file containing the paths as follows

C:/temp1
C:/temp1/insideTemp1
C:/temp2/

I might end up storing huge amount of paths inside the data structure and I am looking for extremely low retrial time.

Here are the data structures that I think might be useful in this situation :-
B tree
B+ tree
Wavelet Tree
Trie
and B-Trie

I am not sure what data structure would be best useful in this situation.

I am not familiar with the B-trie, nor there are much resources on the internet that I can find that would help me in implementing and understanding one, but I think this might be a good choice for implementation considering it would have advantages of both b tree and trie, correct me if I am wrong.

I would like to know what would be the good data structure in implementing this problem and if you can redirect me some resource that would help me to get started that would be great!

Also, just out of curiosity, in the future if I want to expand the suggested data structure towards full-fledged file system would the suggested data structure be good enough to expand?

Every answer is appreciated.

Best Answer

Quick Short Answer

Pick a regular Tree Structure first, later look for a similar optimized version

Long Boring Extended Answer

Use a regular Tree Structure first.

I read that it seems you try to store the entire path in each node element, and that's one of the reasons your structure may requiere unnecesarily optimization.

Note: The following examples are pseudocode. These are not specific Programming Language, finished examples.

Instead of having something like these:

public struct NodeStruct
{
  public string FullPath;
  public *NodeStruct Container;
  public List<*NodeStruct> Items;
}

int main()
{
  int ErrorCode = 0;

  *NodeStruct MyFileSystem =
    new nodeStruct;
  MyFileSystem->FullPath = "FileSystem";
  MyFileSystem->Container = null;

  *NodeStruct CDrive =         
    new nodeStruct;
  CDrive->FullPath = "C:\";
  CDrive->Container = MyFileSystem;

  *NodeStruct DDrive =         
    new nodeStruct;
  DDrive->FullPath = "D:\";
  DDrive->Container = MyFileSystem;

  *NodeStruct SomeFolder = null;
  *NodeStruct SomeFile = null;

  SomeFolder =
    new nodeStruct;
  SomeFolder->FullPath = "C:\Program Files";
  SomeFolder->Container = CFolder;

  SomeFolder =
    new nodeStruct;
  SomeFolder->FullPath = "C:\My Documents";
  SomeFolder->Container = CFolder;

  SomeFile =
    new nodeStruct;
  MyFileSystem->FullPath = "C:\My Documents\Homework1.doc";
  MyFileSystem->Container = SomeFolder;

  free MyFileSystem;

  return ErrorCode;
} // int main(...)

You may want to have something like these:

public struct NodeStruct
{
  string Identifier;

  public *NodeStruct Container;
  public List<*NodeStruct> Items;
}

string GetFullPath(*NodeStruct AFileSystemItem)
{
  string ErrorCode = "";

  // code to "traverse" a tree item or tree node
  // and get the full path

  return ErrorCode;
} // string GetFullPath(...)


int main()
{
  int ErrorCode = 0;

  *NodeStruct MyFileSystem =
    new nodeStruct;
  MyFileSystem->FullPath = "FileSystem";
  MyFileSystem->Container = null;

  *NodeStruct CDrive =         
    new nodeStruct;
  CDrive->FullPath = "C:";
  CDrive->Container = MyFileSystem;

  *NodeStruct DDrive =         
    new nodeStruct;
  DDrive->FullPath = "D:";
  DDrive->Container = MyFileSystem;

  *NodeStruct SomeFolder = null;
  *NodeStruct SomeFile = null;

  SomeFolder =
    new nodeStruct;
  SomeFolder->FullPath = "Program Files";
  SomeFolder->Container = CFolder;

  SomeFolder =
    new nodeStruct;
  SomeFolder->FullPath = "My Documents";
  SomeFolder->Container = CFolder;

  SomeFile =
    new nodeStruct;
  MyFileSystem->FullPath = "Homework1.doc";
  MyFileSystem->Container = SomeFolder;

  string AFullPath = GetFullPath(SomeFile);

  // "AFullPath" should have the "C:\My Documents\Homework1.doc" value

  free MyFileSystem;

  return ErrorCode;
} // int main(...)

You didn't specify a particular Programming Language in your question. Neither if you are using an Imperative, Procedural, Functional or Object Oriented P.L., that may help suggest you a library.

Cheers.

Related Topic