PHP Data Structures – Adding Ordered Nodes to Tree in Arbitrary Order

data structuresPHPtrees

I have a database of nodes. Each node can have exactly one parent, but any number of children. Some nodes may be stored with no parent, but at run time, I can create a default "root" node to be the parent of these.

First off, what is this data structure called? I think it's just a "tree," but I want to make sure I'm using the correct terminology.

Another issue I'm running into is finding an efficient way to create the tree. The nodes can be retrieved from the database in any order, so when I add a node to the tree, its parent may not be in the tree yet. This means that when I add a node, I have to check whether any existing nodes have parents and then add the parent in the appropriate spot.

I've searched a lot online, but most talk about trees just refers to "binary trees," and even then I can't even find any good examples of constructing the tree in code — it's usually working with an existing tree.

Are there any examples of efficient ways to build this tree (whatever it may be called)? Language doesn't matter, but I'll be using php.

Best Answer

With the caveat that each node has exactly one parent except for the root node, calling this a tree is fine.

With respect to building the tree, one way is to do the following:

  1. keep some kind of associative lookup of existing nodes indexed by their unique ID (I don't know PHP, but I'm thinking some kind of dictionary or hash)
  2. keep a similar lookup of orphan nodes, indexed by parent ID
  3. when you create a node:
    1. see if it's parent is in the existing set
      • if it is, attach the child to the parent
      • if not, insert the new node in the orphan set
    2. then see if any orphans are waiting for it (nodes in the orphan set indexed with the new node's ID)
      • if there are, attach them and remove them from the orphan set
    3. finally, add the new node to the existing set
  4. at the end, anything still in the orphan set is either a problem, or should be attached to your root node.

Assuming, of course, you have some unique ID to identify parent/child relationships.

Oh, and I seem to have used the word set to mean some kind of associative container - note that it can't be unique or 3.2 will have problems with multiple orphans of the same node.

Related Topic