PHP and MySQL – Identifying a Race Condition

MySQLPHP

I have a friend arguing this is a race condition, but I'm struggling to understand how.

This is for a web application using PHP and MySQL where users can login and create tickets, these tickets have tasks which users can add. Each task has a time spend variable.

When the user submits a task, it adds a row with the task data to a table. After the entry has been added, it calculates the total time spent for all tasks for this particular ticket, and updates a field in a different table with that value.

The argument my friend puts forth is that with many users, there will be race condition issues as users are simultaneously adding tasks for tickets.

But my defence is that as long as the update for total time spent is run after the insert query, it will still update the field correctly regardless of how many users are adding tasks at the same time.

Best Answer

If you don't use transactions, then yes, there is a minor race condition that can result in having the incorrect value for the total time spent for a brief while. Consider for example the following timing of events:

[task 1 added]
[total selected]
                [task 2 added]
[total updated]
                [total selected]
                [total updated]

Now the client 1 selects the total amount, but a new task is added after this and for a brief while, the total contains an incorrect value. However, soon after the client 2 will update the total to the correct value.

Can the tasks be deleted in addition to adding them? If so, the situation might become more complex and requires further analysis.

You should really use transactions in this use case. This ensures that the updating of the total and the addition of the task happen simultaneously when viewed from outside of the transaction. Using a transaction with a proper isolation level eliminates the minor race condition.

You should note, however, that while transactions help in this use case there are various transaction isolation levels and all potential race conditions in don't disappear unless your database truly supports the SERIALIZABLE isolation level and you are actually using it. Many databases have the default isolation level set to something else than SERIALIZABLE and do not actually support the true SERIALIZABLE isolation level while they silently accept it and offer something worse. To read about transaction isolation levels, see e.g. http://www.postgresql.org/docs/9.1/static/transaction-iso.html (this is for PostgreSQL, not MySQL)

For an example of the case when the race condition can occur despite using transactions, see this timing:

[transaction 1 begins]
                       [transaction 2 begins]
[task 1 added]
                       [task 2 added]
[total selected]
                       [total selected, doesn't yet see transaction 1]
[total updated]
                       [total updated, waits for lock of transaction 1]
[transaction 1 commits]
                       [transaction 2 commits]

Now transaction 2 updates the total but doesn't yet see the row added by transaction 1, and thus the total ends up containing only the value of task 2 but not the value of task 1. This happens if you're not using the SERIALIZABLE isolation level. If, on the other hand, you are using the SERIALIZABLE isolation level on recent PostgreSQL, one of the transactions would fail due to serialization failure and you would need to be prepared to retry.

To fix this without using SERIALIZABLE, you need to select a locking row for update (the SQL command SELECT ... FOR UPDATE):

[transaction 1 begins]
                       [transaction 2 begins]
[select for update]
                       [select for update, waits for transaction 1]
[task 1 added]
[total selected]
[total updated]
[transaction 1 commits]
                       [select for update wait done]
                       [task 2 added]
                       [total selected, doesn't yet see transaction 1]
                       [total updated, waits for lock of transaction 1]
                       [transaction 2 commits]

I'm not sure if MySQL supports the true SERIALIZABLE isolation level or if it's just snapshot isolation. There was at least some work put into it by Michael Cahill, but I'm not sure if that work got ever merged to the MySQL distribution.

Do note that for transactions to work properly, you need to use a transactional database engine. I.e. InnoDB instead of MyISAM.

Related Topic