Servicing background tasks on a large site

design, iis7, multithreading, web-development

We're dealing with an interesting problem on StackOverflow.

We've got a whole bunch of little "needs to be done soon-ish" tasks. An example is updating "Related Questions" lists. What we've done in the past is to piggy-back those tasks onto some users' page loads.

This was never ideal, but it wasn't really noticeable. Now that SO has passed the 1,000,000 question mark, those unlucky users are starting to feel it.

The natural solution is to actually push these tasks into the background. There are two broad ways of doing this I'm considering.

1. In IIS as a custom Thread-Pool/Work-Queue

Basically, we spin up a few threads (non-ThreadPool, so as not to interfere with IIS) and have them service some collections we're shoving Funcs into.
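To make the shape of option 1 concrete, here is a minimal, language-neutral sketch in Python rather than C#. It assumes a fixed number of dedicated threads draining a single queue of zero-argument callables (the stand-in for Funcs). The class name `BackgroundWorkQueue`, its methods, and the task strings are all hypothetical, not anything from the actual SO code base.

```python
import queue
import threading


class BackgroundWorkQueue:
    """A fixed set of dedicated worker threads draining a queue of callables."""

    def __init__(self, worker_count=2):
        self._tasks = queue.Queue()
        self._errors = []  # failures are recorded instead of killing the worker
        self._threads = [
            threading.Thread(target=self._run, name=f"bg-worker-{i}", daemon=True)
            for i in range(worker_count)
        ]
        for t in self._threads:
            t.start()

    def enqueue(self, task):
        """Queue a zero-argument callable; returns immediately."""
        self._tasks.put(task)

    def _run(self):
        while True:
            task = self._tasks.get()
            try:
                if task is None:  # sentinel: shut this worker down
                    return
                task()
            except Exception as exc:
                self._errors.append(exc)
            finally:
                self._tasks.task_done()

    def drain(self):
        """Block until everything queued so far has been processed."""
        self._tasks.join()

    def shutdown(self):
        for _ in self._threads:
            self._tasks.put(None)
        for t in self._threads:
            t.join()

    @property
    def errors(self):
        return list(self._errors)


# Usage: the request thread only pays for enqueue(); the work happens elsewhere.
results = []
work = BackgroundWorkQueue(worker_count=2)
work.enqueue(lambda: results.append("related-questions:42"))
work.enqueue(lambda: 1 / 0)  # a failing task must not take the worker down
work.enqueue(lambda: results.append("related-questions:43"))
work.drain()
work.shutdown()
```

The worker count is fixed up front, so the amount of background work competing with request handling is bounded by construction.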

The big pro here is simplicity. We don't have to worry about marshaling anything, nor do we have to make sure some external service is up and responding.

We also get access to all of our common code.

The con is, well, that we shouldn't use background threads. The objections I know of are all centered around starving IIS (if you use ThreadPool) and the threads dying randomly (due to AppPool recycling).

We've got existing infrastructure to make the random thread death a non-issue (it's possible to detect that a task has been abandoned, basically), and limiting the number of threads (and using non-ThreadPool threads) isn't difficult either.
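The post doesn't say how abandoned tasks are detected, so the following is only one possible approach, sketched under assumptions: each claimed task carries a lease that expires if its worker disappears. `TaskLedger` and everything in it is hypothetical. A real version would keep this state somewhere that survives an AppPool recycle, such as the database, rather than in memory.

```python
import time


class TaskLedger:
    """Tracks in-flight tasks and lets them be reclaimed once their lease expires."""

    def __init__(self, lease_seconds, clock=time.monotonic):
        self._lease_seconds = lease_seconds
        self._clock = clock
        self._claims = {}  # task_id -> time at which the current claim expires

    def claim(self, task_id):
        """Return True if the caller now owns the task, False if a live lease exists."""
        now = self._clock()
        expires = self._claims.get(task_id)
        if expires is not None and expires > now:
            return False
        self._claims[task_id] = now + self._lease_seconds
        return True

    def complete(self, task_id):
        """Drop the claim once the task has finished."""
        self._claims.pop(task_id, None)

    def abandoned(self):
        """List tasks whose owner let the lease run out without completing."""
        now = self._clock()
        return [t for t, expires in self._claims.items() if expires <= now]


# Usage with a fake clock so the example is deterministic.
now = [0.0]
ledger = TaskLedger(lease_seconds=30, clock=lambda: now[0])
first = ledger.claim("related-questions:42")    # worker A takes the task
second = ledger.claim("related-questions:42")   # worker B is refused while the lease is live
now[0] = 31.0                                   # worker A's thread died in a recycle
stale = ledger.abandoned()
retaken = ledger.claim("related-questions:42")  # worker B can now pick it up
```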


Am I missing any other objections to in-process IIS thread-pooling/work-queues?

Moved to StackOverflow, as it wasn't really addressed here.

2. As a Service

Either some third-party solution, or a custom one.

Basically, we'd marshal a task across the process boundary to some service and just forget about it. Presumably we're either linking some code in, or we're restricted to raw SQL + a connection string.
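The "marshal and forget" step could look something like the sketch below, again in Python and purely illustrative. It assumes tasks are sent as a name plus plain-data arguments encoded as JSON, and that the service keeps a registry of handlers. `HANDLERS`, `marshal`, `dispatch`, and the task name are all hypothetical.

```python
import json

HANDLERS = {}  # task name -> function; this registry lives on the service side


def handler(name):
    """Decorator that registers a function as the handler for a task name."""
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register


@handler("refresh_related_questions")
def refresh_related_questions(question_id):
    return f"refreshed related questions for {question_id}"


def marshal(name, **args):
    """Web-app side: turn a task into bytes that can cross a process boundary."""
    return json.dumps({"task": name, "args": args}).encode("utf-8")


def dispatch(payload):
    """Service side: decode the message and run the matching handler."""
    message = json.loads(payload.decode("utf-8"))
    fn = HANDLERS.get(message["task"])
    if fn is None:
        raise LookupError(f"no handler for task {message['task']!r}")
    return fn(**message["args"])


payload = marshal("refresh_related_questions", question_id=42)
outcome = dispatch(payload)
```

Unlike the in-process queue, only data crosses the boundary, not a closure. That is why the service needs its own copy of the handler code.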

The pro is that it's the "right way" to do this.

The cons are that we're either very restricted in what we can do, or we're going to have to work out some system for keeping this service in sync with our code base. We'll also need to hook all of our monitoring and error logging up somehow, which we get for free with the "In IIS" option.

Are there any other benefits or problems with the service approach?

In a nutshell, are there unforeseen and insurmountable problems that make approach #1 unworkable, and if so, are there any good third-party services we should look into for approach #2?

Best Answer

A few weeks back I asked a similar question on SO. In a nutshell, my approach for some time now has been to develop a Windows Service. I would use NServiceBus (essentially MSMQ under the covers) to marshal requests from my web app to my service. I used to use WCF, but getting a distributed transaction to work correctly over WCF always seemed like a pain in the ass. NServiceBus did the trick: I could commit data and create tasks in a transaction and not worry whether my service was up and running at the time. As a simple example, if I ever needed to send an email (say, a registration email), I would create the user account and fire off a signal to my Windows Service (to send the email) in a single transaction. The message handler on the service side would pick up the message and process it accordingly.
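The value of "commit data and create tasks in a transaction" is easiest to see in a small sketch. This one swaps in a different technique from the NServiceBus/MSMQ setup described above: a plain database table acts as the queue, and SQLite is used only so the example is self-contained. The table names, `register_user`, and the sample addresses are all hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)")
db.execute(
    "CREATE TABLE pending_tasks ("
    "id INTEGER PRIMARY KEY, kind TEXT NOT NULL, payload TEXT NOT NULL)"
)
db.commit()


def register_user(conn, email, welcome_subject):
    """Create the account and queue its registration email as one atomic unit."""
    with conn:  # commits on success, rolls back if anything inside raises
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.execute(
            "INSERT INTO pending_tasks (kind, payload) VALUES (?, ?)",
            ("send_registration_email", welcome_subject),
        )


register_user(db, "alice@example.com", "Welcome, Alice")

try:
    # The task insert fails (NULL payload), so the user insert is rolled back too.
    register_user(db, "bob@example.com", None)
except sqlite3.IntegrityError:
    pass

user_count = db.execute("SELECT COUNT(*) FROM users").fetchone()[0]
task_count = db.execute("SELECT COUNT(*) FROM pending_tasks").fetchone()[0]
emails = [row[0] for row in db.execute("SELECT email FROM users")]
```

Either both the account and its pending email exist, or neither does. The consumer that actually sends the email can be down at the time without losing work.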

Since ASP.NET 4.0 and AppFabric have been released, there are a number of viable alternatives to the mechanism above. Referring back to the question I mentioned above, we now have AppFabric's AppInitialize (via net.pipe) as well as ASP.NET 4.0's Auto-Start feature, which make developing Windows Services as web apps a viable alternative. I have started doing this now for a number of reasons (the biggest one being that deployment is no longer a pain in the ass):

  1. You can develop a web UI over your service (since it's running as a web app). This is extremely useful to see what is happening at runtime.
  2. Your deployment model for your web apps will work for your service application.
  3. IIS provides a few neat features for handling application failures (similar in some respects to a Windows Service).
  4. Web developers are very familiar with developing web apps (naturally); most don't know much about best practices for developing a Windows Service.
  5. It provides a number of alternatives for exposing an API for other apps to consume.

If you go this route (forgive me for copying and pasting from my original post), I would definitely consider running the background logic in a separate web application. There are a number of reasons for this:

  1. Security. There may be a different security model for the UI displaying information about the running background processes. I would not want to expose this UI to anyone else but the ops team. Also, the web application may run as a different user which has an elevated set of permissions.
  2. Maintenance. It would be great to be able to deploy changes to the application hosting the background processes without impacting users of the front-end website.
  3. Performance. Having the application separated from the main site processing user requests means that background threads will not diminish IIS's capability to handle the incoming request queue. Furthermore, the application processing the background tasks could be deployed to a separate server if required.

Doing this gets back to the marshaling aspect. WCF, NServiceBus/RabbitMQ/ActiveMQ etc., vanilla MSMQ, RESTful API (think MVC) are all options. If you are using Windows Workflow 4.0 you could expose a host endpoint which your web app could consume.

The web hosting approach for services is still fairly new to me; only time will tell if it was the correct choice. So far so good, though. By the way, if you don't want to use AppFabric (I couldn't because, for some bizarre reason, Windows Server Web Edition ain't supported), the Auto-Start capability mentioned in the Gu's post works nicely. Stay away from the applicationhost.config file, though; everything in that post can be set up through the IIS console (Configuration Editor at the main server level).

Note: I had originally posted a few more links in this message but alas, this is my first post to this exchange and only one link is supported! There were basically two others; to find them, Google "Death to Windows Services...Long Live AppFabric!" and "auto-start-asp-net-applications". Sorry about that.