Part of what you are looking for is a "priority queue". At previous employers we did a very primitive version of this: my heuristic was to allow only some processors to handle short-running jobs (where "short" could still mean minutes), while others handled the long-running ones (the quarterly report could take almost two days to process). This guaranteed that short jobs always had processing time available.

I also used a scoreboard that listed the jobs ready to be run, and the first processor able to handle a task would pick it up and run it single-threaded (they were cheap computers that hadn't been depreciated yet and so could not be discarded). Many folks use the opposite: a scheduler that tells processors which work unit to do next. My advice would be to have each instance run a single task at a time - this drastically simplifies the scheduling.
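If it helps picture it, here's a minimal C# sketch of that scoreboard idea (all the names are hypothetical, and the polling loop is deliberately naive):

using System;
using System.Collections.Concurrent;

// Hypothetical scoreboard: two queues of jobs ready to run. Every
// processor may take short jobs; only some are allowed long ones,
// so short jobs always have capacity reserved.
class JobBoard
{
    public readonly ConcurrentQueue<Action> ShortJobs = new ConcurrentQueue<Action>();
    public readonly ConcurrentQueue<Action> LongJobs = new ConcurrentQueue<Action>();
}

class Processor
{
    private readonly JobBoard _board;
    private readonly bool _handlesLongJobs;

    public Processor(JobBoard board, bool handlesLongJobs)
    {
        _board = board;
        _handlesLongJobs = handlesLongJobs;
    }

    public void RunLoop()
    {
        while (true)
        {
            Action job;
            if (_board.ShortJobs.TryDequeue(out job) ||
                (_handlesLongJobs && _board.LongJobs.TryDequeue(out job)))
            {
                job(); // one task per instance, run to completion
            }
            else
            {
                System.Threading.Thread.Sleep(100); // nothing ready; check the board again
            }
        }
    }
}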
Scheduling arbitrary jobs of arbitrary lengths is a hard problem in distributed processing. Almost every decision is going to involve simulating lots of runs - which is one of the quirks of queuing theory, the field all of this is ultimately based on.
One of the other devs suggested we leave the "job processors" alone to just pull whatever is in the queue next ("round robin"). I say that this could lead to a situation where a single instance has pulled down too many large jobs and is struggling to get them done while the other instances sit idle.
This needs simulation to answer. My earlier scheme used something very similar. If you have stats on previous job runs, you can model it in Excel. I've picked up this book on another post's recommendation and am looking to learn some techniques to be better able to answer problems like the one you're describing. Actual numbers trump everything, so gather data and run simulations based on them.
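As a rough illustration of what such a simulation can look like (the 90/10 short/long job split below is invented; substitute your own historical stats), here's a toy comparison of round-robin assignment against letting the least-loaded worker pull the next job:

using System;
using System.Linq;

class SchedulingSim
{
    static void Main()
    {
        var rand = new Random(42);
        const int workers = 4, jobs = 10000;

        // Invented distribution: 90% short jobs (1-5 min), 10% long (60-300 min).
        var durations = Enumerable.Range(0, jobs)
            .Select(_ => rand.NextDouble() < 0.9 ? rand.Next(1, 6) : rand.Next(60, 301))
            .ToArray();

        // Round robin: job i always goes to worker i % workers, load ignored.
        var rr = new double[workers];
        for (int i = 0; i < jobs; i++)
            rr[i % workers] += durations[i];

        // Pull model: the least-loaded (i.e. first-free) worker takes the next job.
        var pull = new double[workers];
        foreach (var d in durations)
            pull[Array.IndexOf(pull, pull.Min())] += d;

        Console.WriteLine("Round robin, busiest worker: {0} min", rr.Max());
        Console.WriteLine("Pull model,  busiest worker: {0} min", pull.Max());
    }
}

Run it a few times with different seeds and distributions; the gap between the two policies is exactly the "one instance drowning while the others idle" effect in question.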
I don't have all the answers. Hopefully I can shed some light on it.
To simplify my previous statements about .NET's threading models: the Parallel library uses Tasks, and the default TaskScheduler for Tasks uses the ThreadPool. The higher you go in the hierarchy (the ThreadPool is at the bottom), the more overhead you have when creating the items. That extra overhead certainly doesn't mean it's slower, but it's good to know it's there. Ultimately the performance of your algorithm in a multi-threaded environment comes down to its design. What performs well sequentially may not perform as well in parallel. There are too many factors involved to give you hard and fast rules; they change depending on what you're trying to do. Since you're dealing with network requests, I'll try to give a small example.
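To make that hierarchy concrete, here's the same unit of work queued at each of the three levels:

using System.Threading;
using System.Threading.Tasks;

class ThreadingLevels
{
    static void DoWork(int i) { /* ... */ }

    static void Main()
    {
        // Bottom: raw ThreadPool. Cheapest, but no results,
        // continuations, or cancellation built in.
        ThreadPool.QueueUserWorkItem(_ => DoWork(0));

        // Middle: a Task. The default TaskScheduler runs it on the
        // ThreadPool, but you get results, continuations, and cancellation.
        Task t = Task.Factory.StartNew(() => DoWork(1));
        t.Wait();

        // Top: the Parallel library. Built on Tasks; it handles
        // partitioning and worker management for you.
        Parallel.For(0, 100, DoWork);
    }
}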
Let me state that I am no expert with sockets, and I know next to nothing about ZeroC Ice. I do know a bit about asynchronous operations, though, and this is where it will really help you. If you send a synchronous request via a socket, then when you call Socket.Receive(), your thread will block until a response is received. This isn't good: your thread can't make any more requests, since it's blocked. Using the Socket.BeginXxx() methods (BeginReceive(), BeginSend(), and so on), the I/O request is made and put in the IRP queue for the socket, and your thread keeps going. This means your thread could actually make thousands of requests in a loop without any blocking at all!
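A stripped-down sketch of that pattern with raw .NET sockets (error handling omitted, and it assumes an already-connected socket):

using System.Net.Sockets;

class AsyncReceiveSketch
{
    static void StartReceive(Socket socket)
    {
        var buffer = new byte[4096];

        // Returns immediately; the I/O request is queued, and the callback
        // fires on an I/O completion thread when data arrives.
        socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
            ar =>
            {
                int received = socket.EndReceive(ar);
                // ... process buffer[0..received] ...
                StartReceive(socket); // post the next receive immediately
            },
            null);

        // The calling thread is free here to issue more requests.
    }
}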
If I'm understanding you correctly, you are making calls via ZeroC Ice in your testing code, not actually trying to reach an HTTP endpoint. If that's the case, I admit that I don't know how ZeroC Ice works. I would, however, suggest following the advice listed here, particularly the part "Consider Asynchronous Method Invocation (AMI)". The page shows this:
By using AMI, the client regains the thread of control as soon as the invocation has been sent (or, if it cannot be sent immediately, has been queued), allowing the client to use that thread to perform other useful work in the mean time.
Which seems to be the equivalent of what I described above using .NET sockets. There may be other ways to improve performance when trying to do a lot of sends, but I would start here or with any other suggestion listed on that page. You've been very vague about the design of your application, so I can't be more specific than I have been above. Just remember: do not use more threads than absolutely necessary to get done what you need; otherwise you'll likely find your application running far slower than you want.
Some examples in pseudocode (I tried to make it as close to Ice as possible without actually having to learn it):
var iterations = 100000;
for (int i = 0; i < iterations; i++)
{
    // The thread blocks here waiting for the response.
    // That slows down your loop, and you're just wasting
    // CPU cycles that could instead be sending/receiving more objects.
    MyObjectPrx obj = iceComm.stringToProxy("whateverissupposedtogohere");
    obj.DoStuff();
}
A better way:
public interface MyObjectPrx : Ice.ObjectPrx
{
    Ice.AsyncResult GetObject(int obj, Ice.AsyncCallback cb, object cookie);
    // other functions
}

public static void Finished(Ice.AsyncResult result)
{
    MyObjectPrx obj = (MyObjectPrx)result.GetProxy();
    obj.DoStuff();
}

static void Main(string[] args)
{
    // threaded code...
    var iterations = 100000;
    for (int i = 0; i < iterations; i++)
    {
        int num = 0;            // whatever value you need
        MyObjectPrx prx = null; // obtain the proxy however your app does it

        Ice.AsyncCallback cb = new Ice.AsyncCallback(Finished);

        // This call returns immediately and the loop continues:
        // it doesn't wait for a response, it just keeps sending socket
        // requests as fast as your CPU can issue them. The response from
        // the server is handled in the callback function when the request
        // completes. Hopefully you can see how this is much faster when
        // sending over sockets. If your server does not use an async
        // model like this, however, it's quite possible that it won't
        // be able to handle the requests.
        prx.GetObject(num, cb, null);
    }
}
Keep in mind that more threads != better performance when trying to send over sockets (or really when doing anything). Threads are not magic; they will not automatically solve whatever problem you're working on. Ideally you want one thread per core; only when a thread is spending much of its time waiting can you justify having more. Running each request in its own thread is a bad idea, since it causes context switches and wastes resources. (If you want to see everything I wrote about that, click edit and look at the past revisions of this post. I removed it since it only seemed to cloud the main issue at hand.)
You can definitely make these requests in threads if you want to make a large number of requests per second. However, don't go overboard with thread creation. Find a balance and stick with it. You'll get better performance with an asynchronous model than with a synchronous one.
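One common way to find that balance is to cap the number of in-flight requests rather than spawning more threads. A sketch (the cap of 64 and the BeginRequest stub are made up; tune the cap by measurement and wire in your real async call):

using System;
using System.Threading;

class Throttle
{
    // Arbitrary cap of 64 outstanding requests; tune it against your server.
    static readonly SemaphoreSlim InFlight = new SemaphoreSlim(64);

    static void SendAll(int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            InFlight.Wait(); // only blocks once the cap is reached
            BeginRequest(i, () => InFlight.Release());
        }
    }

    // Stub standing in for whatever async send you use (sockets, AMI, ...).
    // A real implementation would invoke onComplete from its completion
    // callback rather than synchronously like this.
    static void BeginRequest(int i, Action onComplete)
    {
        onComplete();
    }
}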
I hope that helps.
Best Answer
Premature optimization is the root of all evil. If it works, don't fix it; and if you're curious whether it scales, don't lose sleep over it - just test it.
Whether or not you'll ever release the software, there are numerous tools that can help you run automated load and stress tests, simulating thousands or even millions of connections. It really is as simple as that: if your tests show there's a problem, fix it; if not, move on.