Job Scheduling – Algorithm for Maximum Overlapping Jobs

algorithms

Problem: you are given n jobs, each of which has a start and an end time. The task is to simply output the maximum number of active jobs (ones that overlap with one another, inclusive), and to return the time range of overlapping jobs.

For example (format: job = start, end):

J1 = 0, 1
J2 = 1, 1.2
J3 = 2, 2.5
J4 = 0.3, 0.7
J5 = 1.5, 2
J6 = 0.5, 1

Would have two overlapping ranges:

0 to 1.2, with four jobs (J1, J2, J4, J6)
1.5 to 2.5 with two jobs (J3, J5)

The first overlapping range would be the solution — 4 overlapping jobs with a time range of 0 to 1.2.

As you can see, the input may or may not be sorted, which is problematic for performance reasons. I am wondering if there is a standard algorithm for this problem because I am given a very large string that I have to process and find jobs that overlap.

My current approach is to treat it like an online problem, and therefore read the start and end times, then store it into a hashmap, and afterwards just read the rest of the inputs and if it falls in an existing interval, append it to the values, then change the end time if the end time is greater than the previous end time.

Am I on the right track? Thanks so much.

Best Answer

Put the start and end points of the job intervals into an array (together with the information to which job they belong, and if the entry is a start point or an end point). Then sort those items (in O(n*log n) time). Afterwards, it will be possible to determine the number (and the set) of overlapping jobs at any time by a linear scan through the sorted array: whenever a new interval starts, add the corresponding job to the set of active jobs. Whenever an interval ends, remove the corresponding one from the set.

That way, you can easily determine groups of jobs which are overlapping or non-overlapping (there must be a point in time where no job is active if two groups are separate), or the maximum number of simultaneous jobs (which is not what you asked for, but maybe of interest either).

I think this algorithm belongs to the class of sweep line algorithms.

Related Topic