Unit-of-Work – Handling Concurrency in Unit of Work Pattern

unit-of-work

I am reading about the Unit of Work pattern, and there is one thing that is not clear to me: what happens when you request a couple of records on one thread (1) and another thread (2) removes one of those records and commits the change?

Is the removed record on the first thread given a “removed” state? The problem with that approach is that the program would have to check, every time it wants to use a record, whether the record still exists. It would also have to lock the record while using it.

I cannot think of any other approach. How is this problem solved in practice?

Best Answer

In general, a unit of work corresponds to a database transaction. So from that perspective:

BEGIN TRANSACTION

-- do stuff

COMMIT TRANSACTION

Databases are very advanced in how they handle this. Depending on the isolation level, data you read after the BEGIN TRANSACTION statement is "locked" against other threads (you can also override this behavior in your SELECT statement with explicit locking options, but I am ignoring that here). Other threads that want to access the same data will block until this transaction is committed or rolled back. That's why it's important to keep the work done inside the transaction as short as possible.
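
To make the correspondence concrete, here is a rough ADO.NET sketch of what a unit of work ultimately wraps; the connection string, table, and column are placeholders:

// using System.Data.SqlClient;
using(var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using(var transaction = connection.BeginTransaction())
    {
        using(var command = connection.CreateCommand())
        {
            command.Transaction = transaction;
            command.CommandText = "DELETE FROM MyRecord WHERE old = 1";
            command.ExecuteNonQuery();
        }
        // Other transactions touching the same rows block until this point.
        transaction.Commit();
    }
}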

In practice, in an application using the repository pattern, there are two typical scenarios:

  1. Read-modify-write all within a transaction
  2. Read in a read-only transaction, modify-write in a later read-write transaction

For scenario 1, imagine you had a button called "Delete old records". The button would (roughly) do this:

using(var uow = GetUnitOfWork()) // opens the unit of work (and the underlying transaction)
{
    var repository = uow.GetRepository<MyRecord>();
    // read the records to delete inside the same transaction
    var oldRecords = repository.Entities
        .Where(x => x.Old)
        .ToList();
    foreach(var record in oldRecords)
    {
        repository.Delete(record);
    }
    uow.Commit(); // sometimes this is optional (default behavior could be to commit)
}

Since all of the reading and writing is done inside a unit of work, which is inside a database transaction, if two people click this button at the same time there is no concurrency issue: the database will serialize the requests.

Scenario 2 is much more common, and trickier. Imagine you have an edit screen, like "edit employee". You typically don't want to keep a transaction open the whole time the user has the edit screen open, so first you get the employee information and render it to the screen. Then the user makes changes and clicks the "Save" button. It's reasonable to assume someone else was also editing that employee at the same time, so both users started with the same copy of the employee, made two different changes, and then each clicked "Save" (one of them clicks first).

If, inside the save routine, you reload the employee data, modify only the data that changed, and then commit it, then as long as the two users changed different things (user 1 changed the first name and user 2 changed the last name), it's possible for both changes to be applied. However, if you write your save routine so that it copies all the data from your edit screen back into your employee entity and saves it, then user 2's changes will overwrite user 1's changes (because user 2 still has the original first name on their screen). Furthermore, if user 1 and user 2 changed the same field, user 2's change will always overwrite user 1's change. This is not the right way to go about it.
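
To make the difference concrete, here is a rough sketch of the two save routines, following the earlier unit-of-work example; the Employee properties, the screenModel, and its Changed flags are hypothetical, and each half is a separate save routine, not one method:

// Save routine A: copy every field from the screen back into the entity.
// Whatever the other user saved in the meantime is silently overwritten ("last write wins").
var employee = repository.Entities.Single(x => x.Id == screenModel.Id);
employee.FirstName = screenModel.FirstName;
employee.LastName = screenModel.LastName;
uow.Commit();

// Save routine B: apply only the fields the user actually changed on this screen,
// so two users editing different fields can both succeed.
var employee = repository.Entities.Single(x => x.Id == screenModel.Id);
if (screenModel.FirstNameChanged) { employee.FirstName = screenModel.FirstName; }
if (screenModel.LastNameChanged) { employee.LastName = screenModel.LastName; }
uow.Commit();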

In practice, using something like Entity Framework or NHibernate, there are two concepts: a session and a transaction. When your edit screen is created, you also create a session. The session keeps track of all data loaded from the database during that session (including what the original values were). The session is disposed of when the screen is closed (I'm considering a desktop application here, not a web application). When you click Save, it starts a transaction, but instead of reloading the data from the database, you just change the original employee entity that you loaded from the database, and then commit the transaction. The framework then issues an UPDATE statement like this (assume the employee table only has three columns: "id", "first_name", "last_name"):

BEGIN TRANSACTION

UPDATE Employee
SET first_name = 'new first name'
WHERE id = 123
    AND first_name = 'old first name'
    AND last_name = 'last name'

COMMIT TRANSACTION

This is called optimistic concurrency. It means that if someone else changed that record between when you loaded the edit screen and when you clicked Save, the update will fail (and the database will indicate this by reporting "0 records updated" instead of "1 record updated"). The framework will detect this and typically throw an exception (NHibernate throws a StaleObjectStateException). It's up to you to handle that exception. Honestly, I just show a message like "someone else edited this record at the same time, so you'll have to close this screen and try again."
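
Putting that together, the save routine in an NHibernate style might look roughly like this; the sessionFactory, employeeId, screenModel, and ShowMessage are placeholders:

// using NHibernate;
// Created when the edit screen opens; disposed when the screen closes.
var session = sessionFactory.OpenSession();
var employee = session.Get<Employee>(employeeId);

// ... the user edits for a while, then clicks "Save" ...

try
{
    using(var transaction = session.BeginTransaction())
    {
        // change the entity the session is already tracking
        employee.FirstName = screenModel.FirstName;
        // flushes the UPDATE with the optimistic WHERE clause shown above
        transaction.Commit();
    }
}
catch (StaleObjectStateException)
{
    ShowMessage("Someone else edited this record at the same time, " +
        "so you'll have to close this screen and try again.");
}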

Note that there is an optimization you can make. You can add a new column to your table named VERSION and configure your framework to check that column instead of comparing every column in the table, so the update would look like this:

BEGIN TRANSACTION

UPDATE Employee
SET first_name = 'new first name', 
    version = 2
WHERE id = 123
    AND version = 1

COMMIT TRANSACTION
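
How the version column gets wired up depends on the framework. As one example, here is a rough Fluent NHibernate flavored sketch (the entity and column names follow the earlier example and are assumptions); the Version mapping tells NHibernate to increment the column on every update and include the old value in the WHERE clause:

// using FluentNHibernate.Mapping;
public class EmployeeMap : ClassMap<Employee>
{
    public EmployeeMap()
    {
        Table("Employee");
        Id(x => x.Id).Column("id");
        // Managed by NHibernate: bumped on each UPDATE and checked in the WHERE clause.
        Version(x => x.Version).Column("version");
        Map(x => x.FirstName).Column("first_name");
        Map(x => x.LastName).Column("last_name");
    }
}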

So, you're relying on the database to make sure changes are atomic (indivisible), but you have to do some extra work to handle the human element: people get a copy of the data, think about it for a while, and make a change, and the data might have changed during that time. It's up to you as a programmer how you want to handle conflicts like that.
