If the objects are small and simple, copy by value might be the
fastest. However, I fear that it forces unnecessary limitations on the
implementation of the supported messages, so I want to avoid it.
If you can anticipate an upper bound on message size, a fixed buffer such as char buf[256] will do. A practical alternative, if you cannot guarantee the bound, is the following hybrid, which only invokes heap allocations in the rare cases where a message exceeds the inline buffer:
struct Message
{
    // Stores the message data if it is small enough.
    char buf[256];
    // Points to 'buf' if the data fits, to a heap allocation otherwise.
    char* data;
};
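To make this concrete, here is a minimal sketch of how the two members might cooperate. The helper names message_init/message_free are hypothetical (not from the original answer), and error handling plus copy/move semantics are omitted:

#include <cstdlib>
#include <cstring>

// Hypothetical helpers illustrating the idea: small messages live in
// 'buf', large ones fall back to the heap.
void message_init(Message& m, const char* src, std::size_t len)
{
    if (len <= sizeof m.buf) {
        std::memcpy(m.buf, src, len);
        m.data = m.buf;                                 // common case: no allocation
    } else {
        m.data = static_cast<char*>(std::malloc(len));  // rare case: heap
        std::memcpy(m.data, src, len);
    }
}

void message_free(Message& m)
{
    if (m.data != m.buf)
        std::free(m.data);   // only free what we actually allocated
}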
There are a number of important missing bits of information:
- Why is OpenMPI relevant?
- Why is heterogeneous relevant?
- What is the work that the server is orchestrating?
- You mention one thread per connection is a problem.
If you're using Boost::Asio and you're having the problem of one thread per client, then you're likely doing something wrong. Make sure that you're doing things asynchronously and not blocking any one thread too long. This is probably the easiest thing for you to do at the moment.
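To illustrate the asynchronous pattern, here is a minimal sketch assuming a reasonably recent Boost (with io_context and the move-accepting async_accept overload): each connection gets a small session object, and a single thread running io.run() services all of them, instead of one thread per client.

#include <boost/asio.hpp>
#include <array>
#include <cstddef>
#include <memory>

using boost::asio::ip::tcp;

// One lightweight session object per client instead of one thread per client.
class Session : public std::enable_shared_from_this<Session>
{
public:
    explicit Session(tcp::socket socket) : socket_(std::move(socket)) {}
    void start() { read(); }

private:
    void read()
    {
        auto self = shared_from_this();
        socket_.async_read_some(
            boost::asio::buffer(buf_),
            [self](boost::system::error_code ec, std::size_t n)
            {
                if (!ec) {
                    // Handle the n bytes just read quickly (or hand them
                    // off), then immediately queue the next read.
                    self->read();
                }
            });
    }

    tcp::socket socket_;
    std::array<char, 1024> buf_;
};

void accept_loop(tcp::acceptor& acceptor)
{
    acceptor.async_accept(
        [&acceptor](boost::system::error_code ec, tcp::socket s)
        {
            if (!ec)
                std::make_shared<Session>(std::move(s))->start();
            accept_loop(acceptor);   // keep accepting new clients
        });
}

int main()
{
    boost::asio::io_context io;
    tcp::acceptor acceptor(io, tcp::endpoint(tcp::v4(), 12345));
    accept_loop(acceptor);
    io.run();   // one thread multiplexes every connection
}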
OpenMPI is not what you need. MPI was designed originally for distributed memory architectures. It is possible to use it over the internet, but it is probably not the best choice. The fact you mention heterogeneous and MPI makes me think that you've done a CS course in HPC. That is not relevant in this case.
When deciding on a protocol and architecture, the type of work the clients are doing is important, as is how you share state between the clients and the server, and how long each message takes to process.
If you're optimizing for throughput, then per-message overhead is less important, so verbose serialization formats like XML and JSON are fine. You can even tear down connections between message-processing jobs, meaning the server does not have to maintain a thread per connection.
If you're trying to keep latency low, then per-message overhead matters, so maintaining a persistent connection and using a terse serialization format are worth the effort.
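As a rough illustration of that trade-off (the message layout here is mine, purely for demonstration), compare the same logical message in a verbose text form and a terse fixed binary layout:

#include <cstdint>
#include <cstring>
#include <iostream>
#include <string>

struct Sample { std::uint32_t id; float value; };

int main()
{
    Sample s{12345, 3.14f};

    // Verbose, self-describing text (JSON-like): easy to debug and
    // proxy-friendly, but bigger on the wire and slower to parse.
    std::string text = "{\"id\":12345,\"value\":3.14}";

    // Terse fixed binary layout: 8 bytes, but both ends must agree on
    // the format (and on endianness, which this sketch ignores).
    char packed[sizeof(std::uint32_t) + sizeof(float)];
    std::memcpy(packed, &s.id, sizeof s.id);
    std::memcpy(packed + sizeof s.id, &s.value, sizeof s.value);

    std::cout << "text: " << text.size() << " bytes, binary: "
              << sizeof packed << " bytes\n";
}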
A message broker is an architectural pattern you could consider; it is responsible for distributing messages between your applications.
You're better off using internet-friendly protocols like HTTP: you don't want to worry about proxies or NAT traversal. The internet can be considered a massive distributed system with millions of heterogeneous clients, and REST is an architectural style inspired by exactly that, so it might be appropriate here.
I'm going to assume that:
- the server produces work items for clients to do and aggregates the responses.
- each client receives a message, does some work and sends a response back.
My default choice of technology would be a web server to implement this. It would have two URLs: one where the clients can GET new work, the other where they POST responses.
When you initially start the client, it should periodically poll for new work (you could use WebSockets or HTTP long polling). When the server returns a work item, exactly one client should consume it. Each work item should be stamped with a unique id so that you can ensure there are no duplicates and can correlate responses with the initial job.
When the client has completed the task, it should POST the response and go back to polling for new work.
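Here is a minimal sketch of such a client loop, using libcurl purely as an example HTTP client; the URLs are placeholders, the work-item id handling is elided, and the actual computation is stubbed out:

#include <curl/curl.h>
#include <chrono>
#include <string>
#include <thread>

// Placeholder endpoints; substitute the real server's URLs.
static const char* kWorkUrl   = "http://server.example/work";
static const char* kResultUrl = "http://server.example/result";

static size_t collect(char* ptr, size_t size, size_t nmemb, void* userdata)
{
    static_cast<std::string*>(userdata)->append(ptr, size * nmemb);
    return size * nmemb;
}

int main()
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    for (;;) {
        // GET a work item from the server.
        std::string work;
        CURL* get = curl_easy_init();
        curl_easy_setopt(get, CURLOPT_URL, kWorkUrl);
        curl_easy_setopt(get, CURLOPT_WRITEFUNCTION, collect);
        curl_easy_setopt(get, CURLOPT_WRITEDATA, &work);
        CURLcode rc = curl_easy_perform(get);
        long status = 0;
        curl_easy_getinfo(get, CURLINFO_RESPONSE_CODE, &status);
        curl_easy_cleanup(get);

        if (rc == CURLE_OK && status == 200) {
            // 'work' would carry the unique id; do the job, then POST back.
            std::string result = work;  // placeholder for the actual computation

            CURL* post = curl_easy_init();
            curl_easy_setopt(post, CURLOPT_URL, kResultUrl);
            curl_easy_setopt(post, CURLOPT_POSTFIELDS, result.c_str());
            curl_easy_perform(post);
            curl_easy_cleanup(post);
        } else {
            // No work yet (or an error): back off before polling again.
            std::this_thread::sleep_for(std::chrono::seconds(5));
        }
    }
}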
Web servers don't handle long-running requests well, so you would want to hand the work off to something else, perhaps via a database, IPC or a message queue.
The server then does not have to worry about the capabilities of each client (which makes the heterogeneity irrelevant). The clients will just consume work at whatever rate they can.
Best Answer
Since you can't have your cake and eat it (see comment thread for the question), I've decided on the following solution:
The network wrapper will keep a boost::asio::streambuf, while serialization/deserialization will happen over ostream/istream. This way the message builders don't have to deal with memory management, can use as much or as little space as they need, and can choose whether they want to serialize as binary or text. As an added benefit, I can also easily serialize messages to, and deserialize them from, a file for record/replay during debugging.
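A minimal sketch of that setup; the message format here is invented for illustration, and the same streambuf could just as well carry binary:

#include <boost/asio.hpp>
#include <iostream>
#include <string>

int main()
{
    boost::asio::streambuf buf;

    // Serialize: the message builder writes through a plain std::ostream
    // (boost::asio::streambuf derives from std::streambuf) and never
    // touches memory management.
    std::ostream os(&buf);
    os << "MSG " << 42 << " hello\n";

    // At this point buf.data() could be handed to async_write, or the
    // same bytes could be dumped to a file for record/replay.

    // Deserialize: the reader pulls the fields back through a std::istream.
    std::istream is(&buf);
    std::string tag, payload;
    int id = 0;
    is >> tag >> id >> payload;

    std::cout << tag << ' ' << id << ' ' << payload << '\n';
}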