It is almost always simpler to think sequentially, and then later modify that logic to work better using threads. And, as the expression goes, "If it ain't broke, don't fix it." Most programmers don't use threads simply because there is no need to use them.
If you feel more comfortable using them, more power to you. However, know that if threads do not offer a speed boost by eliminating bottlenecks, they are almost certainly slowing down your program.
Also consider that systems which dedicate only one CPU core to a process will time-slice multiple threads onto that single core to save resources (this does not happen often on modern desktop hardware, though smartphone applications are still very much subject to it). In this case, even if you're eliminating bottlenecks through the use of threads, the program will actually be slower than if you hadn't used threads at all, because of the added context-switching overhead.
And, perhaps the most subtle reason to be cautious with threads, but certainly not the least important: threads have a tendency to do what you don't expect. Yes, if you take precautions, you should be okay. Yes, if your threads don't write to variables shared between threads, you should be okay. That said, thread-related bugs are very hard to find. I take the view that a programmer can never completely eliminate the possibility of creating bugs, and should therefore take measures to protect against them rather than try to rule them out entirely; you should definitely apply that idea to hard-to-find threading bugs as well. In other words, know that despite your very best efforts, using threads will almost certainly create some very serious bugs sooner or later that you wouldn't otherwise have had.
So should you use threads anyway? Well, a healthy knowledge of threads is certainly not a bad thing, especially if you become good at it. However, the movement of late has been towards single-threaded runtimes such as Node.js. One of the main advantages of having a single thread is that it is easy to reason about and to scale, and certain optimizations can be made when instructions are known to run sequentially (even then, operations that could run in parallel can still be run asynchronously on the event loop).
That said, I say do what is most comfortable for you. In my experience, writing a program that you understand has higher priority than making it work faster. Just be sure to use threads when you think it helps you write the program, and not because you want it to work faster, since you shouldn't be worrying so much about performance as you are writing the program (optimization is important, but it can also wait).
There are different ways to do it, but if you are inclined to stick with POCO, you may want to look at the macchina.io (OSP portion) WebEvent implementation - it is essentially a pub/sub messaging framework. There's more there than what you need but it's relatively simple and architecturally you should be able to quickly tailor it to your needs. I have used it in production for many years and it works well; it will also be ported in an OSP-independent form to Poco for one of the next releases.
A client can be either (1) a web socket endpoint or (2) an in-process observer; either kind can post events and/or subscribe to notifications on one or more subjects (topics). You'll probably need many in-process observers and one remote endpoint.
The framework runs in two threads; each queue is serviced by its own thread. There is a dotted-notation naming scheme for subject names, see here for details. Note that the documentation only mentions WebSockets, but naming works exactly the same for in-process observers, and you may want or need a different naming scheme.
I assume you are using nginx as a reverse proxy. In that case, any parallel connections that reach your app are queued by the operating system. You should not even count nginx in your equation, because you can never predict when a connection will arrive. What you do control is the moment of accepting and handling it.
The truth is that listening for connections is usually handled in a single thread, in some kind of loop. It is the handling part that may or may not be multi-threaded.
The simple and safe design: accept one connection, handle it to completion, then go back and accept the next. With that approach there is no need to worry about multiple threads. If another connection arrives in the middle of handling the first one, it will simply be queued by the OS and have to wait.
You can improve on this by writing your app so that the relevant state is stored in a separate entity/database; then you can simply start several single-threaded C++ processes.
If you fork your C++ process or use a FastCGI backend, you can share a single listening socket that dispatches the incoming connections to several single-threaded C++ app instances.
I would strongly advise a multi-process approach due to the "leaky" and "crashy" nature of C++. With multiple processes, any one of them may be restarted and/or crash without compromising the whole system.
Incidentally, you get thread-safety by design if each process handles only one connection at a time. This mimics Node.js web servers, which have a reputation for being snappy.