Database Design – Is Self-Referencing Tables a Good Idea for Geographical Data?

databasedatabase-designdesignobject-oriented-design

I am creating a design for different Geographical types where each type is just a little "place holder" in over all heirarchy.

One example is

Zip code Belongs to Territory Belongs to District

Other would be

Zip code belongs to State belongs to Some Region

We have two schools of thought one is that:

  1  create different tables like Zipcode, Territory, Regions etc 
  1. other is that just create one table which is called Geographical
    Entity and then everything is just a type so you can assign a Type Id
    to each geographical entity and self reference the table

Now, if I go by design #1 then I don't have to worry about any non traditional decision. I can create one class in OO space and then map to each table. Also, imposing business level rules such as Territory can do such and such things but Distrtict can not. And also things like a Territory can belong to District but District can not belong to Territory are all contained in design itself.

However, If I go to Design #2. It is extensible such that if we have a new Geographical type called "XYZ" then we do not need to add a new table for that. We can simply add a new type into Geography type and it would be addressed.

I find Design #2 extensible but unnecessarily complex. Am I right thinking it that way? Or that's the way to go?

Best Answer

There are at least two aspects of database design that your question touches on.

The first is the issue of class/subclass design. What is the best design for classes and subclasses when designing a relational database? (This issue is also known as types and subtypes or generalization and specialization).

The second is the issue of representing a hierarchy in relational data. What is the best design to make insertions quick and easy? What is the best design to make queries quick and easy, especially searching the subtree.

These two sound like the same issue, but they are not. There are situations where subclasses arise, but there is no hierarchy (except for the inheritance hierarchy). There are other situations that arise where there is a hierarchy, but all the participants are of one class, as far as the designer is concerned.

With regard to classes and subclasses, I recommend you look into Martin Fowler's treatment of "table inheritance", particularly "single table inheritance" and "class table inheritance". There are benefits and drawbacks to each choice.

With regard to representing a hierarchy, I recommend you look into "adjacency list model" and "nested set model". Adjacency list is so simple that most newbies learn the technique without ever learning its name. Nested sets is less well known, and is only useful under some special circumstances. It does, however, permit traversing a subtree without using recursion.

I'm not going to go into more detail here, because it's not clear whether you are interested in both of these issues or just one of them.

Related Topic