SQL – Algorithm for Finding Most Overlapping Events

algorithmssql

The Problem

I'm looking for a query to help me solve the following:

  • I have a series of events
  • Each event has a start and end date.
  • Many of these events overlap
  • The answer I'm looking for is the maximum number of events that overlap

Example

Say I have 5 events:

  • 1 Jan -> 9 Jan
  • 7 Jan -> 12 Jan
  • 8 Jan -> 10 Jan
  • 10 Jan -> 15 Jan
  • 12 Jan -> 17 Jan

3 of these events overlap 9th January, which is the maximum overlapping events, so the answer is 3.

(There are also 3 events overlapping on 10th January, but that's the same answer)

What I've Tried

If I was doing this in memory, I could do this:

  • For each event:
    • Get the start date
    • Count the dates that include this event
  • Pick the event with the highest count.

But there are 2 issues with this:

  • It doesn't appear to be very efficient
  • It isn't very SQL-y (ie. is procedural rather than set-based)

Question

How can I implement something like this in SQL?

Notes

  • I don't care to find the start / end dates of the most overlapping events. I just need a count
  • I don't care how often the maximum occurs, I just need the maximum. So, in the example above, I know there are two occasions where 3 events overlap, but I just need the "3".
  • If event A ends on the same day as event B starts, they are considered to be overlapping

Best Answer

It might be more performant to write this in code, not sql.

In code, I would sort the items by start date, then end date. Walk through them and check if it overlaps with the next item. If it does, increment its overlap counter and repeat: check the next item overlaps with your item. If it doesn't, move on to the next.

In SQL, you can do it with a self join to list all the overlaps.

This will show you all overlaps:

select a.eventid from events a
inner join events b 
on a.end > b.start and a.start < b.end

You can then group them by eventid, select the count and take the event with the max count.

select top 1 eventid, count(*) as c
from 
    (select a.eventid from events a
    inner join events b 
    on a.end > b.start and a.start < b.end)
group by eventid
order by c desc
Related Topic