Multithreading – How to Program Thread Allocation on Multicore Processors

Tags: multi-core, multithreading

I would like to experiment with threads on a multi-core processor, e.g. to create a program that uses two different threads that are executed by two different processor cores.

However, it is not clear to me at which level the threads get allocated to the different cores. I can imagine the following scenarios (depending on operating system and programming language implementation):

  1. Thread allocation is managed by the operating system. Threads are created using OS system calls and, if the process happens to run on a multi-core processor, the OS automatically tries to allocate / schedule different threads on different cores.
  2. Thread allocation is managed by the programming language implementation. Allocating threads to different cores requires special system calls, but the programming language's standard thread library handles this automatically when I use the standard thread implementation for that language.
  3. Thread allocation must be programmed explicitly. In my program I have to write explicit code to detect how many cores are available and to allocate different threads to different cores using, e.g., library functions.

To make the question more specific, imagine I have written my multi-threaded application in Java or C++ on Windows or Linux. Will my application magically see and use multiple cores when run on a multi-core processor (because everything is managed either by the operating system or by the standard thread library), or do I have to modify my code to be aware of the multiple cores?

Best Answer

Will my application magically see and use multiple cores when run on a multi-core processor (because everything is managed either by the operating system or by the standard thread library), or do I have to modify my code to be aware of the multiple cores?

Simple answer: Yes, it will usually be managed by the operating system or threading library.

The threading subsystem in the operating system will assign threads to processors on a priority basis (your option 1). In other words, when a thread has used up its time slice or blocks, the scheduler picks the next highest-priority ready thread and assigns it to a CPU. The details vary from operating system to operating system.

That said, options 2 (managed by the language implementation) and 3 (explicit allocation) also exist. For example, the Task Parallel Library and async/await in recent versions of .NET give the developer a much easier way to write parallelizable code, i.e. code that can safely run concurrently with itself. Functional programming languages lend themselves to parallelization because they avoid shared mutable state, and some runtimes will run independent parts of a program in parallel where possible.

As for option 3 (explicitly), Windows allows you to set the thread affinity (specifying which processors a thread may run on). However, setting affinity manually is usually unnecessary outside of the most latency-critical systems. Effective thread-to-processor allocation is highly hardware-dependent and very sensitive to other applications running concurrently.

If you want to experiment, create a long running, CPU intensive task like generating a list of prime numbers or rendering a Mandelbrot set. Now create two threads in your favorite library and run both threads on a multi-core machine (in other words, just about anything released in the last few years). The two threads together should finish in roughly the time one task takes alone, because they run in parallel on separate cores.