Multithreading – Is There a Limit on the Number of Threads That Can Be Spawned Simultaneously?

multithreadingprocess

Yesterday I came across this question: How can i call robocopy within a python script to bulk copy multiple folders?, and I though it might be a good exercise for multithreading.

I though of spawning as many threads as files needed to be copied, each routine having an exception handling system to prevent the whole copying process from crashing (and log -using mutex on the log file – if there was an error).

My question: Is there a limit on the number of thread you can spawn almost simultaneously? If yes, what is the limiting factor?

My question is focused on PC desktop, but I welcome any answer on different hardware (embedded systems, calculus clusters, etc.).

Best Answer

A quick google query revealed this article about limits of processes and threads in Windows. It seems pretty thorough, and there are other articles about pushing the limits in the same series.

I didn't read it, but from what I glanced, on a 32-bit system you will only be able to create a little over 2000 threads, and then you'll run out of memory (or rather, address space, because a 32-bit process has a 2GB address limit). And that's only for empty threads, which don't do anything.

Also, for a copy routine spawning such a huge number of threads would devastate performance. Not only would there be a huge amount of context switching going on, but also the source disk would get thrashed as all the threads tried to read from it at the same time. It might be better for an SSD, but it would still mean a huge IO queue for the system to maintain.

All in all I'm not saying that a file copy operation can't benefit from multithreading, but blindly creating a thread for each file doesn't solve anything. Because in a copy operation it's not the CPU that's the bottleneck - it's the disk (or the network, if you're copying over the network). So any optimization would need to address these problems first.