Should I use multiple column primary keys or add a new column

database-design

My current database design makes use of a multiple column primary key to use existing data (that would be unique anyway) instead of creating an additional column assigning each entry an arbitrary key. I know that this is allowed, but was wondering if this is a practice that I might want to use cautiously and possibly avoid (much like goto in C).

So what are some of the disadvantages I might see in this approach or reasons I might want a single column key?

Best Answer

Usually when you have a table with a multi-column primary key, it's the result of a join table (many-to-many) that became elevated to be its own entity (and thus deserves it's own primary key). There are many who would argue that any join table SHOULD be an entity by default, but that's a discussion for another day.

Let's look at a hypothetical many to many relationship:

Student * --- * Class

(a Student can be in multiple classes, a Class can have multiple students).

In between those two tables will be a junction table called StudentClass (or ClassStudent depending how you write it). Sometimes, you want to keep track of stuff like when the student was in the class. So you'll add it to the StudentClass table. At this point, StudentClass has become a unique entity...and should be given a name to recognize it as such e.g. Enrollment.

Student 1 --- * Enrollment * --- 1 Class

(a student can have many Enrollments, each Enrollment is for one class (or going the opposite way a Class can have many Enrollments, each Enrollment is for one Student).

Now you can query things like, how many students were enrolled in the Chemistry 101 class this past year? Or what classes was the student John Doe enrolled in while attending Acme University? This was possible without the separate primary key, but once you have a primary key for enrollment an easier query would be of these enrollments (by id), how many students received a passing grade?

The determination of whether an entity deserves a PK boils down to how much querying (or manipulation) you will do for that entity. Let's say for instance, you wanted to attach the assignments completed for a student in a class. The logical place to attach this entity (Assignment) would be on the Enrollment entity. Giving enrollment it's own primary key would make the Assignment queries simpler.

Related Topic