Recurring Events Database Model

databasedatabase-designmodelingrecurring-events

I've being searching for a solution for recurring events, so far I've found two approaches:

First approach:

Create an instance for each event, so if the user has a daily event for one year, it would be necessary 365 rows in the table.
It sounds plausible for a fixed time frame, but how to deal with events that has no end date?

Second approach:

Create a Reccuring pattern table that creates future events on runtime using some kind of Temporal expression (Martin Fowler).

Is there any reason to not choose the first approach instead of the second one?
The first approach is going to overpopulate the database and maybe affect performance, right?!

There's a quote about the approach number 1 that says:

"Storing recurring events as individual rows is a recipe for disaster." (https://github.com/bmoeskau/Extensible/blob/master/recurrence-overview.md)

What do you guys think about it? I would like some insights on why that would be a disaster.

I appreaciate your help

Best Answer

The proper answer is really both, and not either or.

Setting aside for a moment the issue of no end date for recurrence: what you want is a header that contains recurrence rules for the whole pattern. That way if you need to change the pattern, you've captured that pattern in a single record that can be edited without risking update anomalies.

Now, joining against some kind of recurrence pattern in SQL is going to be a great big pain in the neck. Furthermore, what if your rules allow you to tweak (edit, or even delete) specific instances of this recurrence pattern?

How do you handle this? You have to create an instance table with one row per recurring instance with a link (foreign key) back to the single rule that was used to create it. This let's you modify an individual child without losing sight of where it came from in case you need to edit (or delete) the entire pattern.

Consider a calendaring tool like Outlook or Google Calendar. These applications use this approach. You can move or edit an instance. You can also change the whole series. The apps ask you which you mean to do whenever you go into an editing mode.

There are some limitations to this. For example, if you edit an instance and then edit the pattern, you need to have a rule that says either (a) new parent wins or (b) modified children always win. I think Outlook and Google Calendar use approach (a).

As for why having each instance recorded explicitly, the only disastrous thing I can think of would be that if you didn't have the link back to the original recurrence pattern you would have a heck of a time cancelling the whole series in one action.

Back to no end date - This might be a case of discretion being the better part of valour and using some kind of rule of thumb that imposes a practical limit on how far into the future you extend such a series - or alternatively you could just not allow that kind of rule in a pattern. Force an end to the pattern and let the rule's creator worry about extending it at whatever future point it becomes necessary.

Related Solutions

Sql – What are the options for storing hierarchical data in a relational database

My favorite answer is as what the first sentence in this thread suggested. Use an Adjacency List to maintain the hierarchy and use Nested Sets to query the hierarchy.

The problem up until now has been that the coversion method from an Adjacecy List to Nested Sets has been frightfully slow because most people use the extreme RBAR method known as a "Push Stack" to do the conversion and has been considered to be way to expensive to reach the Nirvana of the simplicity of maintenance by the Adjacency List and the awesome performance of Nested Sets. As a result, most people end up having to settle for one or the other especially if there are more than, say, a lousy 100,000 nodes or so. Using the push stack method can take a whole day to do the conversion on what MLM'ers would consider to be a small million node hierarchy.

I thought I'd give Celko a bit of competition by coming up with a method to convert an Adjacency List to Nested sets at speeds that just seem impossible. Here's the performance of the push stack method on my i5 laptop.

Duration for     1,000 Nodes = 00:00:00:870 
Duration for    10,000 Nodes = 00:01:01:783 (70 times slower instead of just 10)
Duration for   100,000 Nodes = 00:49:59:730 (3,446 times slower instead of just 100) 
Duration for 1,000,000 Nodes = 'Didn't even try this'

And here's the duration for the new method (with the push stack method in parenthesis).

Duration for     1,000 Nodes = 00:00:00:053 (compared to 00:00:00:870)
Duration for    10,000 Nodes = 00:00:00:323 (compared to 00:01:01:783)
Duration for   100,000 Nodes = 00:00:03:867 (compared to 00:49:59:730)
Duration for 1,000,000 Nodes = 00:00:54:283 (compared to something like 2 days!!!)

Yes, that's correct. 1 million nodes converted in less than a minute and 100,000 nodes in under 4 seconds.

You can read about the new method and get a copy of the code at the following URL. http://www.sqlservercentral.com/articles/Hierarchy/94040/

I also developed a "pre-aggregated" hierarchy using similar methods. MLM'ers and people making bills of materials will be particularly interested in this article. http://www.sqlservercentral.com/articles/T-SQL/94570/

If you do stop by to take a look at either article, jump into the "Join the discussion" link and let me know what you think.

Calendar Recurring/Repeating Events – Best Storage Method

Storing "Simple" Repeating Patterns

For my PHP/MySQL based calendar, I wanted to store repeating/recurring event information as efficiently as possibly. I didn't want to have a large number of rows, and I wanted to easily lookup all events that would take place on a specific date.

The method below is great at storing repeating information that occurs at regular intervals, such as every day, every n days, every week, every month every year, etc etc. This includes every Tuesday and Thursday type patterns as well, because they are stored separately as every week starting on a Tuesday and every week starting on a Thursday.

Assuming I have two tables, one called events like this:

ID    NAME
1     Sample Event
2     Another Event

And a table called events_meta like this:

ID    event_id      meta_key           meta_value
1     1             repeat_start       1299132000
2     1             repeat_interval_1  432000

With repeat_start being a date with no time as a unix timestamp, and repeat_interval an amount in seconds between intervals (432000 is 5 days).

repeat_interval_1 goes with repeat_start of the ID 1. So if I have an event that repeats every Tuesday and every Thursday, the repeat_interval would be 604800 (7 days), and there would be 2 repeat_starts and 2 repeat_intervals. The table would look like this:

ID    event_id      meta_key           meta_value
1     1             repeat_start       1298959200 -- This is for the Tuesday repeat
2     1             repeat_interval_1  604800
3     1             repeat_start       1299132000 -- This is for the Thursday repeat
4     1             repeat_interval_3  604800
5     2             repeat_start       1299132000
6     2             repeat_interval_5  1          -- Using 1 as a value gives us an event that only happens once

Then, if you have a calendar that loops through every day, grabbing the events for the day it's at, the query would look like this:

SELECT EV.*
FROM `events` EV
RIGHT JOIN `events_meta` EM1 ON EM1.`event_id` = EV.`id`
RIGHT JOIN `events_meta` EM2 ON EM2.`meta_key` = CONCAT( 'repeat_interval_', EM1.`id` )
WHERE EM1.meta_key = 'repeat_start'
    AND (
        ( CASE ( 1299132000 - EM1.`meta_value` )
            WHEN 0
              THEN 1
            ELSE ( 1299132000 - EM1.`meta_value` )
          END
        ) / EM2.`meta_value`
    ) = 1
LIMIT 0 , 30

Replacing {current_timestamp} with the unix timestamp for the current date (Minus the time, so the hour, minute and second values would be set to 0).

Hopefully this will help somebody else too!

Storing "Complex" Repeating Patterns

This method is better suited for storing complex patterns such as

Event A repeats every month on the 3rd of the month starting on March 3, 2011

Event A repeats Friday of the 2nd week of the month starting on March 11, 2011

I'd recommend combining this with the above system for the most flexibility. The tables for this should like like:

ID    NAME
1     Sample Event
2     Another Event

And a table called events_meta like this:

ID    event_id      meta_key           meta_value
1     1             repeat_start       1299132000 -- March 3rd, 2011
2     1             repeat_year_1      *
3     1             repeat_month_1     *
4     1             repeat_week_im_1   2
5     1             repeat_weekday_1   6

repeat_week_im represents the week of the current month, which could be between 1 and 5 potentially. repeat_weekday in the day of the week, 1-7.

Now assuming you are looping through the days/weeks to create a month view in your calendar, you could compose a query like this:

SELECT EV . *
FROM `events` AS EV
JOIN `events_meta` EM1 ON EM1.event_id = EV.id
AND EM1.meta_key = 'repeat_start'
LEFT JOIN `events_meta` EM2 ON EM2.meta_key = CONCAT( 'repeat_year_', EM1.id )
LEFT JOIN `events_meta` EM3 ON EM3.meta_key = CONCAT( 'repeat_month_', EM1.id )
LEFT JOIN `events_meta` EM4 ON EM4.meta_key = CONCAT( 'repeat_week_im_', EM1.id )
LEFT JOIN `events_meta` EM5 ON EM5.meta_key = CONCAT( 'repeat_weekday_', EM1.id )
WHERE (
  EM2.meta_value =2011
  OR EM2.meta_value = '*'
)
AND (
  EM3.meta_value =4
  OR EM3.meta_value = '*'
)
AND (
  EM4.meta_value =2
  OR EM4.meta_value = '*'
)
AND (
  EM5.meta_value =6
  OR EM5.meta_value = '*'
)
AND EM1.meta_value >= {current_timestamp}
LIMIT 0 , 30

This combined with the above method could be combined to cover most repeating/recurring event patterns. If I've missed anything please leave a comment.