SQL Database – How to Create a Timed-Event Architecture Using SQL Database

c#, event-programming, polling, sql-server

The title of my question is general because I feel like this problem is of a general nature, but to set the stage I'm going to provide a specific example.

We use a homegrown workflow engine that is driven by database tables. Within those tables lurks a directed graph that represents the workflow. The graph contains Stages and Activities; a line is drawn between two Stage nodes, and the resulting Activity node contains code to be executed. We use CSScript to compile and execute the code on the fly.

Within the workflow, Task records represent the work to be executed. Each Task contains some relevant metadata in XML form. The Task records traverse the directed graph, and the code is executed as the Task passes through the activity. So at any given moment, each stage might contain x number of tasks, waiting to be executed on an activity.

To execute a Task on an activity, it needs to be scheduled. A Schedule record containing a datetime, taskid, stageid and activityid determines when and where this Task gets executed next. Periodically, we execute a query that returns Schedule records that are due, and then for each record so returned we stand up an Activity instance and execute it, handing it the Task record as a parameter.
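
To make the "due records" query concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than our actual SQL Server setup. The table layout, the column name `due_at`, and the helper names `create_schema` and `fetch_due` are assumptions for illustration only; the real Schedule table may look different.

```python
import sqlite3


def create_schema(conn):
    """Create a hypothetical Schedule table (column names are assumed)."""
    conn.execute(
        """
        CREATE TABLE Schedule (
            id         INTEGER PRIMARY KEY,
            due_at     TEXT    NOT NULL,  -- ISO-8601 timestamp, one fixed format
            taskid     INTEGER NOT NULL,
            stageid    INTEGER NOT NULL,
            activityid INTEGER NOT NULL
        )
        """
    )


def fetch_due(conn, now_iso):
    """Return schedule rows due at or before now_iso, earliest first."""
    cur = conn.execute(
        "SELECT id, due_at, taskid, stageid, activityid "
        "FROM Schedule WHERE due_at <= ? ORDER BY due_at, id",
        (now_iso,),
    )
    return cur.fetchall()
```

Each returned row identifies the Task to run and the Activity to run it on; the caller would stand up the Activity instance and hand it the Task.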

This query used to run 10 times per second. Recently, I added some code that counts how many consecutive times the query returns no records. If this count reaches 60, I lower the polling rate to once per second and start counting again. If the count reaches 60 again, I lower the rate to once per minute. As soon as a record appears in the query result, I set the rate back to 10 times per second and begin the counting process again. The net effect is that the schedule table is polled rapidly during busy periods and sparsely during quiet periods. We expect to save a few hundred dollars per Azure instance per month, just from this one simple change.
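
The back-off logic can be sketched roughly as follows. This is an illustrative Python version, not our production code; the class name `AdaptivePoller` and its method names are made up for the example, while the three intervals and the threshold of 60 empty polls come from the description above.

```python
class AdaptivePoller:
    """Tracks empty polls and picks the next polling interval in seconds."""

    INTERVALS = (0.1, 1.0, 60.0)  # 10 per second, 1 per second, 1 per minute
    EMPTY_LIMIT = 60              # empty polls before slowing down one step

    def __init__(self):
        self.tier = 0         # index into INTERVALS
        self.empty_count = 0  # consecutive empty polls at the current tier

    @property
    def interval(self):
        return self.INTERVALS[self.tier]

    def record(self, found_rows):
        """Report one poll result and return the interval to sleep next."""
        if found_rows:
            # Work appeared: go back to the fastest rate and restart counting.
            self.tier = 0
            self.empty_count = 0
            return self.interval
        self.empty_count += 1
        if self.empty_count >= self.EMPTY_LIMIT:
            self.empty_count = 0  # start counting again
            if self.tier < len(self.INTERVALS) - 1:
                self.tier += 1    # slow down one step, capped at the slowest
        return self.interval
```

The polling loop would call `record(bool(rows))` after each query and sleep for the returned number of seconds before querying again.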

So here's my question.

This is obviously a polling pattern. Is there a way to make it "event-driven," so that the database is only hit when a schedule record is due, without having to constantly poll the database?

Best Answer

The general solution is to use a database that supports asynchronous notifications. Several do:

  • Oracle - Allows registration for notification of changes on objects (object change notification or OCN) and changes in the results of specified queries (query result change notification or QRCN).
  • PostgreSQL - Simple notification containing a tag and an optional payload generated using the NOTIFY statement as a standalone command or as part of a function. (The latter could be part of a trigger.) Clients can subscribe to notifications by issuing a LISTEN statement and selecting on the connection handle (exactly how varies with language binding).
  • SQL Server - Service Broker, a built-in queuing system where clients can use a combination of the WAITFOR and RECEIVE statements to listen for events. Also offers Query Notifications (surfaced in ADO.NET as SqlDependency), which are roughly comparable to Oracle's QRCN.
  • Sybase - Has registered procedures that allow invocation of callbacks on the clients if they've asked for it. (Not positive about this one.)

If you're stuck with a database that doesn't support notifications (MySQL, DB2), it will have to be done out-of-band using one of the methods described in the other answers.

Once you have a method for the database to notify you that something has changed, you can do a query that determines how long it is until the next event is supposed to happen and then wait that long for a notification. If you get a notification, repeat the query/wait cycle. If you don't get a notification, it means the time you calculated has arrived and it's time to do whatever the event dictates. This should get you down to the point where you're only querying the database when you know for sure something needs to happen.
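
As a rough sketch of that query/wait cycle in Python: the `changed` event below stands in for whatever mechanism delivers the database notification (a thread blocked on LISTEN, WAITFOR/RECEIVE, and so on). The function names, the callbacks `next_due_in` and `run_due`, and the `idle_wait` fallback are all assumptions for illustration, not part of any database API.

```python
import threading


def wait_step(next_due_in, run_due, changed, idle_wait=60.0):
    """Run one query/wait cycle and report what happened.

    next_due_in: callable returning seconds until the earliest schedule
                 record is due, or None if the schedule is empty.
    run_due:     callable that executes whatever is now due.
    changed:     threading.Event set by the (assumed) notification listener.
    """
    delay = next_due_in()
    timeout = idle_wait if delay is None else max(0.0, delay)
    if changed.wait(timeout):
        # The schedule changed while we were waiting: recompute the wait.
        changed.clear()
        return "notified"
    if delay is None:
        return "idle"  # nothing was scheduled and nothing changed
    run_due()          # the computed time arrived without a notification
    return "ran"


def run_scheduler(next_due_in, run_due, changed, stop):
    """Repeat the cycle until stop is set (set changed as well to wake it)."""
    while not stop.is_set():
        wait_step(next_due_in, run_due, changed)
```

Because the event stays set until it is cleared, a notification that arrives between the query and the wait is not lost; the wait returns immediately and the delay is recomputed.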
