Option #2, using reference tables, is the standard way of doing it. It has been used by millions of programmers, and is known to work. It is a pattern, so anyone else looking at your stuff will immediately know what is going on. There exist libraries and tools that work on databases, saving you from lots and lots of work, that will handle it correctly. The benefits of using it are innumerable.
Is it wasteful? Yes, but only slightly. Any half-decent database will always keep such frequently joined small tables cached, so the waste is generally imperceptible.
All other options that you described are ad hoc and hacky, including MySQL's enum
, because it is not part of the SQL standard. (Other than that, what sucks with enum
is MySQL's implementation, not the idea itself. I would not mind seeing it one day as part of the standard.)
Your final option #3 with using a plain integer is especially hacky. You get the worst of all worlds: no referential integrity, no named values, no definitive knowledge within the database of what a value stands for, just arbitrary integers thrown all over the place. By this token, you might as well quit using constants in your code, and start using hard-coded values instead. circumference = radius * 6.28318530718;
. How about that?
I think you should re-examine why you find reference tables onerous. Nobody else finds them onerous, as far as I know. Could it be that it is because you are not using the right tools for the job?
Your sentence about having to "encode things and deal with integers", or having to "create elaborate programming constructs", or "creating new object oriented entities on the programming side", tells me that perhaps you may be attempting to do object-relational mapping (ORM) on the fly dispersed throughout the code of your application, or in the best case you may be trying to roll your own object-relational mapping mechanism, instead of using an existing ORM tool for the job, such as Hibernate. All these things are a breeze with Hibernate. It takes a little while to learn it, but once you have learned it, you can really focus on developing your application and forget about the nitty gritty mechanics of how to represent stuff on the database.
Finally, if you want to make your life easier when working directly with the database, there are at least two things that you can do, that I can think of right now:
Create views that join your main tables with whatever reference tables they reference, so that each row contains not only the reference ids, but also the corresponding names.
Instead of using an integer id for the reference table, use a CHAR(4) column, with 4-letter abbreviations. So, the ids of your categories would become "TEST", "DSGN", "PROG", "OTHR". (Their descriptions would remain proper English words, of course.) It will be a bit slower, but trust me, nobody will notice.
Finally, when there are only two types, most people just use a boolean column. So, that "standard/exception" column would be implemented as a boolean and it would be called "IsException".
Best Answer
You are overnormalizing (and as a database specialist, I don't say that often).
Think of how you use this data and you will see it makes the most sense to be in one table. If each address related to multiple zip codes, a second table makes sense. Since the relationship is one-to-one there is no need at all to split out the table unless you have too wide a record which is rarely the case in an address table.
Also since your configuration is not the common way that this data is stored, you will create a maintenance issue as new devs will be confused by the structure. Really, each address should be completely contained in one record.