Database Design – Thoughts on Denormalization via Short-Circuit Key

database-designnormalization

In the simplified database schema depicted below, I'm considering denormalizing by adding a short-circuit key (EventId, relation shown in red) on Appointment.

enter image description here

One advantage is not needing a join to access appointments data by event (the usual way it will be queried). An additional advantage involves constraints. I'd like to allow only one appointment per person per event. The short-circuit key makes it easy to add a unique constraint on PersonId+EventId on the Appointment table, whereas it's unclear how I'd achieve this constraint at the database level otherwise.

At the business level, timeslots are invariant once created, and appointment editing is already restricted to changing the timeslot within the same event, so I don't see much cost to keeping the data consistent here.

It also appeals to me intuitively, as I think of both the appointments and the timeslots as belonging to the event.

Thoughts on this denormalization? Does it seem reasonable?

Best Answer

In an RDBMS context, the first question is: why do you want to avoid the extra join? RDBMSes are designed to optimize such queries; do you experience any tangible performance issue? If it is for programming convenience only, you could very well create a view to automate the shortcut without real redundancy.

Looking at your design and requirements, it seems that a Timeslot is a component of an Event (i.e. in the sense of UML composition): the slot makes no sense without the event, and probably, if an event would be deleted, so would be all its slots. If this is true:

  • A trick could be to use a composite primary key for Timeslot made of EventId and a sequence number of the slot in the event.
  • The foreign key in Appointment would then be replaced by a pair EventId,Sequence
  • While such composite keys have drawbacks, and would not be my first recommendation, they have nevertheless several advantages in your scenario:
    1. it facilitates the verification of time-slot change within the same event;
    2. it provides the shortcut you are looking for, without redundancy and would hence still benefit from the automatic consistency; (this makes it in my view a better candidate than your denormalization).
    3. it encourages the DDD viewpoint that Event is the root of an aggregate that includes also Timeslot; and access to an aggregate's entities shall always go via the aggregate root.

If at this stage, you still think you need to denormalize, then evaluate the pros and cons carefully:

  • some access may be facilitated,
  • but the updates would be (slightly) more complex. So, if there are many more writes than reads, or if there is a high degree of concurrency, you might suffer from overhead costs that might exceed the hoped benefits; for example, the extra locks for propagating the denormalized change, could make the effects of denormalization less performant than you think.

But ultimately, if the pros exceed the cons, this specific shortcut, does not seem intrinsically unreasonable.

Related Topic