Scheduling – How to Make a Cluster Run a Task Only Once

Tags: architecture, language-agnostic, scheduling

If you had a task that you wanted to run only once, at a regular interval, on a cluster of servers, what would be the best way of achieving this? The definition of a cluster in this case is two or more identical servers with distributed sessions sitting behind a load balancer.

Use Case: You have a task that is expensive to run and that should only be run once per X hours. This job could, for instance, iterate over a set of records and update their status.

  • The worst-case scenario is that having the job run twice invalidates your data.
  • The best-case scenario is that the job utilises resources on all your servers.

Requirements Summary:

  1. The job must still run even if one of the nodes is down.
  2. The job must only be run once per schedule.
  3. If multiple jobs are scheduled at the same time, or at overlapping times, the number of running jobs must be distributed equally between the servers.
  4. The machines must have the same code base and be synchronised via NTP.
  5. The configuration may differ from node to node, via environment variables.
  6. The job has to start on time, or within a given interval of the assigned time (say, 5 minutes).

Possible solutions

  • Set one node as the master node. This doesn't work, as it violates requirement 1 above.
  • Kick off the job via a request that the load balancer distributes. Unfortunately, this has the side effect that if you have multiple jobs running at the same time, they may all be run by the same machine.

This would have to run in Java, in a servlet container. However, I'm not looking for help with coding the jobs themselves.

Surely this is a solved problem with a known best solution.


Related question.
https://stackoverflow.com/questions/5949038/schedule-job-executes-twice-on-cluster

This isn't a duplicate, as that question's solutions are insufficient per the requirements given above: the most upvoted solution suffers from a race condition, and the second solution violates requirement 3.

Best Answer

Do you have a shared database? I've done this using a database as the arbiter in the past.

Basically, each "job" is represented as a row in the database. You schedule a job by adding a row to the database with the time you want it to run; then each server does:

SELECT TOP 1 *
FROM jobs
WHERE state = 'NotRun'
ORDER BY run_time ASC

That way, they'll all pick the job that is scheduled to run next. They all sleep so that they wake up when the job is actually supposed to run. Then, they all do this:

UPDATE jobs
SET state = 'Running'
WHERE job_id = :id
  AND state = 'NotRun'

Where :id is the identifier of the job you got in the step above. Because the update is atomic, only one of the servers will actually update the row. You can check the database's "number of rows updated" status code to determine whether you were the server that actually updated the row, and therefore whether you are the server that gets to run the job.
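The "only one server wins" property can be sketched without a database: below, a `ConcurrentHashMap` stands in for the jobs table, and its atomic conditional `replace` plays the role of the conditional `UPDATE`. The class and job names are illustrative only; in a real system this would be a JDBC `executeUpdate` whose return value is the row count.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicClaimDemo {
    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the jobs table: job_id -> state.
        ConcurrentMap<Long, String> jobs = new ConcurrentHashMap<>();
        jobs.put(42L, "NotRun");

        AtomicInteger winners = new AtomicInteger();

        // Five "servers" race to claim job 42. The conditional replace
        // mirrors UPDATE ... SET state='Running' WHERE state='NotRun':
        // it is atomic, so it succeeds for exactly one caller.
        Thread[] servers = new Thread[5];
        for (int i = 0; i < servers.length; i++) {
            servers[i] = new Thread(() -> {
                boolean won = jobs.replace(42L, "NotRun", "Running");
                if (won) {
                    winners.incrementAndGet(); // this server runs the job
                }
            });
            servers[i].start();
        }
        for (Thread t : servers) {
            t.join();
        }

        System.out.println("winners=" + winners.get()); // prints winners=1
    }
}
```

However many servers race, exactly one sees the transition succeed, which is the same guarantee the row count from the atomic `UPDATE` gives you.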

If you didn't "win" and you're not running the job, just go back to step 1 immediately. If you did "win", schedule the job to execute in another thread, then wait a couple of seconds before going back to step 1. That way, servers that didn't get the job this time are more likely to pick up a job that's scheduled to run immediately.
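The whole select/sleep/claim/run loop can be sketched in-memory as below. This is a minimal simulation, not a production implementation: the `ConcurrentMap` and the `synchronized` block stand in for the shared database and the atomic `UPDATE`, and all names (`Job`, `tryClaim`, `serverLoop`) are made up for illustration. Two "server" threads compete over two scheduled jobs; each job runs exactly once.

```java
import java.util.Comparator;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ClusterSchedulerSketch {
    // One row in the "jobs table".
    static final class Job {
        final long id;
        final long runAtMillis;
        volatile String state = "NotRun"; // NotRun -> Running -> Done
        Job(long id, long runAtMillis) { this.id = id; this.runAtMillis = runAtMillis; }
    }

    // Stand-ins for the shared database and an audit of who ran what.
    static final ConcurrentMap<Long, Job> jobs = new ConcurrentHashMap<>();
    static final ConcurrentMap<Long, String> ranBy = new ConcurrentHashMap<>();

    // Step 1: SELECT TOP 1 ... WHERE state = 'NotRun' ORDER BY run_time ASC
    static Optional<Job> nextJob() {
        return jobs.values().stream()
                .filter(j -> j.state.equals("NotRun"))
                .min(Comparator.comparingLong(j -> j.runAtMillis));
    }

    // Step 2: UPDATE jobs SET state='Running' WHERE job_id=:id AND state='NotRun'
    // Atomic check-and-set: returns true for exactly one server.
    static boolean tryClaim(Job j) {
        synchronized (j) {
            if (!j.state.equals("NotRun")) return false;
            j.state = "Running";
            return true;
        }
    }

    static void serverLoop(String serverName) {
        while (true) {
            Optional<Job> next = nextJob();
            if (next.isEmpty()) return; // nothing left to do in this demo
            Job job = next.get();
            long wait = job.runAtMillis - System.currentTimeMillis();
            try {
                if (wait > 0) Thread.sleep(wait); // sleep until run time
                if (tryClaim(job)) {
                    ranBy.put(job.id, serverName); // "run" the job
                    job.state = "Done";
                    // Pause briefly so the other servers are more likely
                    // to pick up the next job, then go back to step 1.
                    Thread.sleep(50);
                }
                // Losers go straight back to step 1.
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long now = System.currentTimeMillis();
        jobs.put(1L, new Job(1, now + 100));
        jobs.put(2L, new Job(2, now + 150));

        Thread a = new Thread(() -> serverLoop("server-a"));
        Thread b = new Thread(() -> serverLoop("server-b"));
        a.start(); b.start();
        a.join(); b.join();

        System.out.println("job 1 ran once on " + ranBy.get(1L));
        System.out.println("job 2 ran once on " + ranBy.get(2L));
    }
}
```

Which server wins each job is nondeterministic, but no job is ever run twice, and a job runs as long as at least one server's loop is alive, which is what requirements 1 and 2 ask for.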
