Java – Should I use composite primary keys or not

database-designjavalanguage-agnosticsql

There seems to only be 2nd class support for composite database keys in Java's JPA (via EmbeddedId or IdClass annotations). And when I read up on composite keys, regardless of language, people keep coming across as them being a bad thing. But I cannot understand why. Are composite keys still acceptable to use these days? If not, why not?

I've found one person who agrees with me:
http://weblogs.sqlteam.com/jeffs/archive/2007/08/23/composite_primary_keys.aspx

But another who doesn't:
http://weblogs.java.net/blog/bleonard/archive/2006/11/using_composite.html

Is it just me, or are people not able to make the distinction of where a composite key is appropriate or not? I see composite primary keys useful when the table doesn't represent an entity – i.e. when it represents a join table.

A simple example:


Actor { Id, Name, Email }
Movie { Id, Name, Year }
Character { Id, Name }
Role { Actor, Movie, Character }

Here Actor, Movie and Character obviously benefit from having an Id column as the primary key.

But Role is a Many-To-Many join table. I see no point in creating an id just to identify a row in the database. To me it seems obvious that the primary key is { Actor, Movie, Character }. It also seems like a rather limiting feature, especially if the data in the join table changes all the time, you could find yourself with primary key collisions once the primary key sequence wraps around to 0.

So, back to the original question, is it still acceptable practice to use composite primary keys? If not, why not?

Best Answer

In my personal opinion you should avoid composite primary keys due to several reasons:

  1. Future changes: when you design a database you sometimes miss what in the future will become important. A significant example for this is thinking a combination of two or more fields is unique (and thus can become a primary key), whereas in the future you want to allow NULLs or other non-unique values in them. Having a single primary key is a good solid solution against such changes.

  2. Uniformity: If every table has a unique numerical ID, and you also maintain some standard as to its name (e.g. "ID" or "tablename_id"), the code and SQL referring to it is clearer (in my opinion).

There are other reasons, but these are just a few.

The main question I would ask is why not use a separate primary key if you have a unique set of fields? What's the cost? An additional integer index? That's not too bad.

Hope that helps.