I hear threads are expensive. While that isn't an issue yet, I don't want to get into the habit of spawning a new thread for everything if there's a better way.
Threads have overhead, but the longer a thread lives, the less that overhead matters. For example, suppose spinning up a thread is expensive and takes 100 ms, but the thread then lives for an hour. The startup cost is a tiny fraction of its lifetime, so who cares? If you can bury the thread initialization in application startup, when users expect delays anyway, all the better. If the thread lives for less than a second, though, the overhead starts to dominate and you should consider alternatives.
That being said, do not spawn threads to do nothing. Each thread should have a purpose.
I'm not sure if I should sacrifice "separation of concerns" to save using threads. In theory, the message receive/broadcast jobs could be combined; although they're separate duties.
In my experience, multithreaded applications need multiple threads. Combining concerns, e.g. merging the listener and the broadcaster in this case, will increase complexity to the point of being unmanageable, and it will create bugs. The key question is: where do the threads block? If a thread blocks on accepting a socket, it makes no sense for it to also handle requests. One of its concerns will suffer: requests will be delayed, or new connections might fail. Just create more threads.
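To make that concrete, here is a minimal sketch (class and method names are illustrative, not from your code): one thread's only concern is blocking on accept(), and each accepted connection is handed to a small handler pool so neither concern suffers.

```java
import java.io.*;
import java.net.*;
import java.util.concurrent.*;

public class Listener {
    public static void main(String[] args) throws Exception {
        ExecutorService handlers = Executors.newFixedThreadPool(4);
        try (ServerSocket server = new ServerSocket(0)) {
            // Accept loop: this thread's only job is accepting connections.
            Thread acceptor = new Thread(() -> {
                try {
                    while (true) {
                        Socket client = server.accept();          // blocks here
                        handlers.submit(() -> handleClient(client));
                    }
                } catch (IOException e) { /* server socket closed: stop */ }
            });
            acceptor.start();

            // Demo client so the sketch is self-contained.
            try (Socket s = new Socket("localhost", server.getLocalPort());
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream()))) {
                System.out.println(in.readLine());
            }
        }
        handlers.shutdown();
    }

    // Request handling happens off the accept thread, in the pool.
    static void handleClient(Socket client) {
        try (PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            out.println("hello");
        } catch (IOException e) { /* client went away; nothing to do */ }
    }
}
```

If accepting ever stalls here, requests already in flight keep being served, and vice versa.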
Note 1: you should be using thread primitives, not raw threads. Since you tagged this question java I will point to the specific class: ExecutorService. The Executors class provides several factory methods to make it easy to create one for various tasks. The basic idea is that you submit tasks (Runnable or Callable) and let the framework manage the threads; each submission returns a Future you can use to collect the result.
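Basic usage looks like this minimal sketch:

```java
import java.util.concurrent.*;

public class Pool {
    public static void main(String[] args) throws Exception {
        // The framework owns the threads; you only describe the tasks.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // submit() returns a Future; get() blocks until the result is ready.
        Future<Integer> sum = pool.submit(() -> 2 + 2);
        System.out.println(sum.get());   // prints 4

        pool.shutdown();                 // no new tasks; finish what's queued
    }
}
```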
This has two primary advantages:
Managing threads can be hard. While it appears easy, there is a lot of boilerplate logic that is easy to screw up, resulting in weird, difficult-to-reproduce (and difficult-to-fix) bugs. Push this responsibility onto a robust framework that is well tested and used in millions of other programs. If there were bugs, then Oracle (or Microsoft for .NET, or whatever other vendor) would have found them by now.
Your program's purpose is not "to manage threads"; it is "to run a chat server." Let the framework handle the threading concerns and focus on the chat server aspect. By focusing on the tasks and not the threads, your code becomes more concise, expressive, and clear.
I wrote up an answer to an old Stack Overflow question that shows a use case for this. I do not want to derail the focus of this answer too much, so you can read more there, as well as by searching both Programmers and SO: Creating an unknown amount of threads in Java?
Note 2: given that this appears to be a desktop/server application as opposed to embedded, there is not much need to worry about spawning threads. Modern CPUs (amd64, Intel Core) have multiple cores on-die, typically between two and six. Some models support simultaneous multithreading (hyper-threading), adding more logical cores that can execute code concurrently. Add in time slicing with today's high clock speeds, and there is no need to worry about spawning too many threads as long as each thread has a reasonable workload. In the case of your question, the threads appear to be performing small bits of work and possibly blocking on I/O. You are not spawning a million threads for digital video encoding, for example.
Even embedded architectures, whose speed and core counts are weaker than those of desktop and server CPUs, have no problem handling a few threads. My smartphone and tablet both have many applications running in the background, with zero performance problems even with all of those applications and their threads running.
Note 3: if you are interested in learning more about the tradeoffs between running an algorithm single-threaded and splitting it into pieces to run concurrently, there are a few topics worth exploring. Given the problem described in your question this is not an immediate need, but it could make for some interesting and educational reading.
MapReduce is a programming model for splitting a task into pieces that do not rely on each other, performing those pieces in parallel, and joining the results together.
Parallelism in databases involves multiple threads querying data and joining the results. There has been a lot of research into the tradeoffs of using multiple threads for a single database query and a lot of information is out there. Outside of schema/query optimization this is very useful information from a more theoretical perspective.
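As a toy illustration of the map/reduce idea (using Java's parallel streams rather than an actual MapReduce framework):

```java
import java.util.List;

public class MapReduceToy {
    public static void main(String[] args) {
        List<String> words = List.of("map", "reduce", "join");
        int total = words.parallelStream()         // independent pieces, possibly on several threads
                         .mapToInt(String::length) // map step: word -> length
                         .sum();                   // reduce/join step: combine results
        System.out.println(total);                 // 3 + 6 + 4 = 13
    }
}
```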
I'd have a queue per physical machine.
Thus a backup process would take at most N=2 tasks, and it would make sure not to pick a task from a queue for which another task is already running.
(N could be easily adjusted when needed.)
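A sketch of that bookkeeping in Java (all names here are illustrative): a semaphore caps the total at N tasks, and a busy set prevents taking a second task from a machine whose queue already has one running.

```java
import java.util.*;
import java.util.concurrent.*;

public class BackupScheduler {
    static final int N = 2;                              // max concurrent backups; easy to adjust
    final Semaphore slots = new Semaphore(N);
    final Set<String> busyMachines = ConcurrentHashMap.newKeySet();
    final Map<String, Queue<String>> queues = new ConcurrentHashMap<>();

    void enqueue(String machine, String task) {
        queues.computeIfAbsent(machine, m -> new ConcurrentLinkedQueue<>()).add(task);
    }

    // Returns the next runnable {machine, task} pair, or null if none is allowed.
    String[] tryTake() {
        if (!slots.tryAcquire()) return null;            // already running N tasks
        for (Map.Entry<String, Queue<String>> e : queues.entrySet()) {
            if (busyMachines.add(e.getKey())) {          // skip machines with a running task
                String task = e.getValue().poll();
                if (task != null) return new String[] { e.getKey(), task };
                busyMachines.remove(e.getKey());         // queue was empty; undo
            }
        }
        slots.release();                                 // nothing eligible right now
        return null;
    }

    void finish(String machine) {                        // call when a backup completes
        busyMachines.remove(machine);
        slots.release();
    }

    public static void main(String[] args) {
        BackupScheduler s = new BackupScheduler();
        s.enqueue("host-a", "backup-1");
        s.enqueue("host-a", "backup-2");
        String[] first = s.tryTake();                    // takes backup-1 from host-a
        String[] second = s.tryTake();                   // null: host-a already busy
        System.out.println(first[1] + " " + (second == null));
    }
}
```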
Best Answer
The design pattern that most closely matches your description is the Producer-Consumer pattern, with a queue between the two sides.
For your exact specs, you would have four threads consuming the queue, which limits the work in progress to four items at a time.
Depending on the language, there may be built-in constructs for multithreaded queue concurrency (C#'s System.Collections.Concurrent, for example), or you could use an external queue server to manage locking for you (RabbitMQ, MSMQ, etc.).
Additionally, there may be constructs like the background thread pool in .NET that manage some of the complexity of actually running the threads, but that will depend highly on the language this is developed in.
I use the producer-consumer pattern extensively in a C# application that I developed, using both in-memory and external queues. As long as you take care of locking so that only a single consumer grabs each queue item, it provides a very simple way to distribute work among threads.
(For my specific implementation, I use System.Collections.Concurrent.BlockingCollection for some of my queues and System.Threading.Tasks (the Task Parallel Library) for managing my consumer threading.)
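Since the question is tagged java, here is a minimal Java sketch of the same setup: a BlockingQueue between the producer and four consumers, so at most four items are worked on at once. The Integer item type and the poison-pill shutdown are illustrative choices, not part of the answer above.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class ProducerConsumer {
    public static void main(String[] args) throws Exception {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        ExecutorService consumers = Executors.newFixedThreadPool(4);
        AtomicInteger processed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(20);

        // Four consumers: take() hands each item to exactly one of them.
        for (int i = 0; i < 4; i++) {
            consumers.submit(() -> {
                try {
                    while (true) {
                        Integer item = queue.take();   // blocks until an item arrives
                        if (item < 0) break;           // poison pill: stop this consumer
                        processed.incrementAndGet();   // "work" on the item
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Producer side: enqueue twenty items.
        for (int i = 0; i < 20; i++) queue.put(i);
        done.await();                                  // wait for all items to be processed
        for (int i = 0; i < 4; i++) queue.put(-1);     // one pill per consumer
        consumers.shutdown();
        System.out.println(processed.get());           // prints 20
    }
}
```

The blocking queue plays the role BlockingCollection plays in the C# version: it handles the locking so each item is consumed exactly once.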