C# Concurrency – How Much Work to Place Inside a Lock Statement?

Tags: c#, concurrency, locks, .net

I am a junior developer working on writing an update for software that receives data from a third-party solution, stores it in a database, and then conditions the data for use by another third-party solution. Our software runs as a Windows service.

Looking at the code from a previous version, I see this:

        static Object _workerLocker = new object();
        static int _runningWorkers = 0;
        int MaxSimultaneousThreads = 5;

        foreach(int SomeObject in ListOfObjects)
        {
            lock (_workerLocker)
            {
                while (_runningWorkers >= MaxSimultaneousThreads)
                {
                    Monitor.Wait(_workerLocker);
                }
            }

            // check to see if the service has been stopped. If yes, then exit
            if (this.IsRunning() == false)
            {
                break;
            }

            lock (_workerLocker)
            {
                _runningWorkers++;
            }

            ThreadPool.QueueUserWorkItem(SomeMethod, SomeObject);

        }

The logic seems clear: wait for room in the thread pool, make sure the service hasn't been stopped, then increment the thread counter and queue the work. _runningWorkers is decremented inside SomeMethod(), inside a lock statement that then calls Monitor.Pulse(_workerLocker).
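Based on that description, SomeMethod() presumably ends with something like the following. This is my reconstruction, not the actual code; the try/finally is an addition of mine so the counter is decremented even if the work item throws:

```csharp
static void SomeMethod(object state)
{
    try
    {
        // ... process the work item passed in via 'state' ...
    }
    finally
    {
        lock (_workerLocker)
        {
            _runningWorkers--;
            // wake a thread blocked in Monitor.Wait on the same lock object
            Monitor.Pulse(_workerLocker);
        }
    }
}
```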

My question is:
Is there any benefit in grouping all the code inside a single lock, like this:

        static Object _workerLocker = new object();
        static int _runningWorkers = 0;
        int MaxSimultaneousThreads = 5;

        foreach (int SomeObject in ListOfObjects)
        {
            // Is doing all the work inside a single lock better?
            lock (_workerLocker)
            {
                // wait for room in ThreadPool
                while (_runningWorkers >= MaxSimultaneousThreads) 
                {
                    Monitor.Wait(_workerLocker);
                }
                // check to see if the service has been stopped.
                if (this.IsRunning())
                {
                    ThreadPool.QueueUserWorkItem(SomeMethod, SomeObject);
                    _runningWorkers++;                  
                }
                else
                {
                    break;
                }
            }
        }

It seems like it may cause a little more waiting for other threads, but then locking repeatedly within a single logical block also seems somewhat time-consuming. However, I'm new to multi-threading, so I'm assuming there are other concerns here that I'm unaware of.

The only other places where _workerLocker gets locked are in SomeMethod(), solely to decrement _runningWorkers, and just after the foreach, to wait for _runningWorkers to reach zero before logging and returning.
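For reference, that final drain step presumably looks something like this (again a reconstruction, not the actual code; note that Monitor.Pulse wakes only one waiter, so if the queueing loop and this drain loop could ever wait at the same time, Monitor.PulseAll in SomeMethod would be the safer choice):

```csharp
// after the foreach: wait for all queued work items to finish
lock (_workerLocker)
{
    while (_runningWorkers > 0)
    {
        Monitor.Wait(_workerLocker);
    }
}
// log completion and return
```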

Thanks for any help.

EDIT 4/8/15

Thanks to @delnan for the recommendation to use a semaphore. The code becomes:

        static int MaxSimultaneousThreads = 5;
        static Semaphore WorkerSem = new Semaphore(MaxSimultaneousThreads, MaxSimultaneousThreads);

        foreach (int SomeObject in ListOfObjects)
        {
            // wait for an available thread
            WorkerSem.WaitOne();

            // check if the service has stopped
            if (this.IsRunning())
            {
                ThreadPool.QueueUserWorkItem(SomeMethod, SomeObject);
            }
            else
            {
                break;
            }
        }

WorkerSem.Release() is called inside SomeMethod().
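For the same robustness reasons, it's worth ensuring the Release() happens even if the work item throws; a sketch (the body of SomeMethod is assumed):

```csharp
static void SomeMethod(object state)
{
    try
    {
        // ... process the work item passed in via 'state' ...
    }
    finally
    {
        // return the slot to the semaphore even on failure,
        // so the queueing loop is never starved of permits
        WorkerSem.Release();
    }
}
```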

Best Answer

This is not a question of performance. It is first and foremost a question of correctness. If you have two lock statements, you cannot guarantee atomicity for operations that are spread between them or that lie partially outside a lock statement. Tailored to the old version of your code, this means:

Between the end of the while (_runningWorkers >= MaxSimultaneousThreads) loop and the _runningWorkers++, anything at all may happen, because the code surrenders and re-acquires the lock in between. For example, thread A might acquire the lock, wait until some other worker exits, and then break out of the loop and release the lock. It is then preempted, and thread B enters the picture, also waiting for room in the thread pool. Because that other worker just quit, there is room, so B doesn't wait long at all. Thread A and thread B now both proceed, in some order, each incrementing _runningWorkers and starting its work.

Now, there are no data races as far as I can see, but logically it's wrong, since more than MaxSimultaneousThreads workers can now be running. The check is (occasionally) ineffective because taking a slot in the thread pool is not atomic. This should concern you more than small optimizations around lock granularity! (Note that, conversely, locking too early or for too long can easily lead to deadlocks.)

The second snippet fixes this problem, as far as I can see. A less invasive change would be to put the _runningWorkers++ right after the while loop, inside the first lock statement.
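That minimal fix would look something like this (a sketch; note that if the service-stopped check afterwards breaks out of the loop, the slot just claimed would have to be released again):

```csharp
lock (_workerLocker)
{
    while (_runningWorkers >= MaxSimultaneousThreads)
    {
        Monitor.Wait(_workerLocker);
    }
    // claim the slot while still holding the lock, so no other
    // thread can slip in between the check and the increment
    _runningWorkers++;
}
```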

Now, correctness aside, what about performance? This is hard to tell. Generally, locking for a longer time ("coarsely") inhibits concurrency, but as you say, this needs to be balanced against the overhead of the additional synchronization in fine-grained locking. Generally, the only solution is benchmarking, and being aware that there are more options than "lock everything everywhere" and "lock only the bare minimum". There is a wealth of patterns, concurrency primitives, and thread-safe data structures available. For example, this seems like the very application semaphores were invented for, so consider using one of those instead of this hand-rolled, hand-locked counter.