Multi-threading in Python – How to Detect Overuse

multithreading, python

I currently feel like I am over-using multi-threading.

I have 3 types of data, A, B and C.

Each A can be converted to multiple Bs and each B can be converted to multiple Cs.

I am only interested in processing the Cs.

I could write this fairly easily with a couple of conversion functions. But I caught myself implementing it with threads and three queues (queue_a, queue_b and queue_c). Two threads do the conversions, and a third acts as the worker:

  • ConverterA reads from queue_a and writes to queue_b
  • ConverterB reads from queue_b and writes to queue_c
  • Worker handles each element from queue_c
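For reference, a minimal sketch of the pipeline described above might look like the following. The conversion functions `a_to_bs` and `b_to_cs`, the `SENTINEL` shutdown marker, and the `run_pipeline` helper are all assumptions made for illustration; the question does not show the real conversions or how the threads are stopped.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker, not from the question


def a_to_bs(a):
    # Placeholder conversion: each A yields two Bs.
    return [f"{a}.b{i}" for i in range(2)]


def b_to_cs(b):
    # Placeholder conversion: each B yields two Cs.
    return [f"{b}.c{i}" for i in range(2)]


def converter(convert, source, sink):
    """Read items from source, convert each one, and put the results on sink."""
    while True:
        item = source.get()
        if item is SENTINEL:
            sink.put(SENTINEL)  # pass the shutdown signal downstream
            return
        for out in convert(item):
            sink.put(out)


def worker(source, results):
    """Handle each C; appending to a list stands in for the real work."""
    while True:
        c = source.get()
        if c is SENTINEL:
            return
        results.append(c)


def run_pipeline(items_a):
    queue_a, queue_b, queue_c = queue.Queue(), queue.Queue(), queue.Queue()
    results = []
    threads = [
        threading.Thread(target=converter, args=(a_to_bs, queue_a, queue_b)),
        threading.Thread(target=converter, args=(b_to_cs, queue_b, queue_c)),
        threading.Thread(target=worker, args=(queue_c, results)),
    ]
    for t in threads:
        t.start()
    for a in items_a:
        queue_a.put(a)
    queue_a.put(SENTINEL)  # no more As will be submitted
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(["a0", "a1"]))
```

Because each stage here is a single thread reading from a FIFO queue, the Cs reach the worker in a predictable order. That would no longer hold if a stage were run by several threads at once.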

The conversions are fairly mundane, and I don't know if this model is too convoluted. But it seems extremely robust to me. Each "converter" can start working even before data has arrived on the queues, and at any time in the code I can just "submit" new As or Bs and it will trigger the conversion pipeline which in turn will trigger a job by the worker thread.

Even the resulting code looks simpler. But I am still unsure whether I am abusing threads for something simple.

Best Answer

It is almost always simpler to think sequentially first, and only later modify that logic to work better using threads. And, as the expression goes, "If it ain't broke, don't fix it." Most programmers don't use threads simply because they have no need for them.
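As a rough illustration of the sequential approach, the A to B to C conversion from the question can be written with plain generator functions. The conversions and the `handle` function below are placeholders invented for this sketch, since the question does not say what the real ones do.

```python
def a_to_bs(a):
    # Placeholder conversion: each A yields two Bs.
    for i in range(2):
        yield f"{a}.b{i}"


def b_to_cs(b):
    # Placeholder conversion: each B yields two Cs.
    for i in range(2):
        yield f"{b}.c{i}"


def cs_from_as(items_a):
    """Lazily produce every C reachable from the given As."""
    for a in items_a:
        for b in a_to_bs(a):
            yield from b_to_cs(b)


def handle(c):
    # Stand-in for whatever the worker does with each C.
    return c.upper()


results = [handle(c) for c in cs_from_as(["a0", "a1"])]
print(results)
```

Generators keep the laziness of the queue-based design (no C is built before it is needed) without any threads, queues, or shutdown logic. Starting from Bs instead of As only requires calling `b_to_cs` directly.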

If you feel more comfortable using them, more power to you. However, know that if threads do not offer a speed boost by eliminating bottlenecks, they are almost certainly slowing down your program.
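One way to check whether threads are actually buying any speed is simply to time both versions on the same input. The sketch below does that for a made-up CPU-bound task; `cpu_bound`, the job sizes, and the helper names are assumptions for illustration, and the timings will vary by machine and Python version.

```python
import threading
import time


def cpu_bound(n):
    # Hypothetical CPU-bound job: sum of squares below n.
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_sequential(jobs):
    return [cpu_bound(n) for n in jobs]


def run_threaded(jobs):
    results = [None] * len(jobs)

    def task(index, n):
        # Each thread writes only to its own slot, so no lock is needed here.
        results[index] = cpu_bound(n)

    threads = [
        threading.Thread(target=task, args=(i, n)) for i, n in enumerate(jobs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def timed(fn, jobs):
    """Return (result, elapsed seconds) for one call of fn(jobs)."""
    start = time.perf_counter()
    out = fn(jobs)
    return out, time.perf_counter() - start


jobs = [200_000] * 4
seq_out, seq_time = timed(run_sequential, jobs)
thr_out, thr_time = timed(run_threaded, jobs)
print(f"sequential: {seq_time:.3f}s, threaded: {thr_time:.3f}s")
```

For CPU-bound pure-Python work like this, the threaded version is generally not expected to be faster on standard CPython builds. If the real workload waits on I/O, substitute it for `cpu_bound` before drawing conclusions.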

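Also consider that a system which gives a process only one CPU core will simulate multiple threads by interleaving them on that single core (this is uncommon on modern computers, though some constrained devices may still work this way). In that case threads cannot speed up CPU-bound work, and the overhead of switching between them will likely make the program slower than if you didn't use threads at all. Something similar applies to CPU-bound pure-Python code on standard CPython builds, where the global interpreter lock lets only one thread execute Python bytecode at a time. Threads can still help when the bottleneck is waiting on I/O.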

And perhaps the most subtle reason to be cautious with threads, though certainly not the least important: threads have a tendency to do what you don't expect. Yes, if you take precautions, you should be okay. Yes, if your threads don't write to variables shared between threads, you should be okay. That said, thread-related bugs are very hard to find. I believe a programmer can never completely eliminate the possibility of bugs, and should therefore take measures to protect against them rather than count on eliminating them entirely. That applies to hard-to-find thread bugs as well. In other words, know that despite your very best efforts, using threads will very likely introduce some serious bugs sooner or later which you wouldn't have had otherwise.
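As an example of the kind of precaution meant here, a value that several threads write to can be guarded with a lock. This is only a sketch: the `Counter` class and `count_in_threads` helper are hypothetical names, not something from the question.

```python
import threading


class Counter:
    """A shared counter whose updates are protected by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write happens while holding the lock, so two
        # threads cannot interleave inside it and lose an update.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def count_in_threads(num_threads, increments_each):
    counter = Counter()

    def task():
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=task) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(count_in_threads(4, 10_000))
```

Without the lock, an update like `self._value += 1` is not guaranteed to be atomic, and lost updates may show up only occasionally, which is exactly what makes this class of bug hard to track down.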

So should you use threads anyway? Well, a healthy knowledge of threads is certainly not a bad thing, especially if you become good at it. However, the movement of late has been towards single-threaded, event-driven runtimes such as Node.js. One of the main advantages of having a single thread is that it is easy to scale, and certain optimizations can be made when the instructions are known to run sequentially (even if those optimizations mean that work which could run in parallel is instead run asynchronously).
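In Python, the closest counterpart to that single-threaded, event-driven style is probably `asyncio`. The sketch below rebuilds the question's three-stage pipeline as coroutines on one thread; the conversion functions and the `SENTINEL` marker are placeholders assumed for illustration.

```python
import asyncio

SENTINEL = object()  # hypothetical end-of-stream marker


def a_to_bs(a):
    # Placeholder conversion: each A yields two Bs.
    return [f"{a}.b{i}" for i in range(2)]


def b_to_cs(b):
    # Placeholder conversion: each B yields two Cs.
    return [f"{b}.c{i}" for i in range(2)]


async def converter(convert, source, sink):
    """Convert items from source and forward the results to sink."""
    while True:
        item = await source.get()
        if item is SENTINEL:
            await sink.put(SENTINEL)  # pass the shutdown signal downstream
            return
        for out in convert(item):
            await sink.put(out)


async def worker(source, results):
    while True:
        c = await source.get()
        if c is SENTINEL:
            return
        results.append(c)  # stand-in for the real work on each C


async def run_pipeline(items_a):
    queue_a, queue_b, queue_c = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    results = []
    for a in items_a:
        queue_a.put_nowait(a)
    queue_a.put_nowait(SENTINEL)
    # All three stages run cooperatively on a single thread.
    await asyncio.gather(
        converter(a_to_bs, queue_a, queue_b),
        converter(b_to_cs, queue_b, queue_c),
        worker(queue_c, results),
    )
    return results


results = asyncio.run(run_pipeline(["a0"]))
print(results)
```

Since only one coroutine runs at a time and switches happen only at `await` points, the shared `results` list needs no lock. This style pays off mainly when the stages wait on I/O; for purely computational conversions it adds structure without adding speed.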

That said, I say do what is most comfortable for you. In my experience, writing a program that you understand has higher priority than making it work faster. Just be sure to use threads when you think it helps you write the program, and not because you want it to work faster, since you shouldn't be worrying so much about performance as you are writing the program (optimization is important, but it can also wait).
