Why Can’t a Server Act on a Request After Sending the Response?

Tags: architecture, async, web-applications

Say, for example, I've got a RESTful web service, and I need to support creating a widget via a POSTed HTTP request. I need to:

  1. Deserialize the POSTed widget.
  2. Validate the deserialized widget.
  3. Persist the widget to a backing service.
  4. Send a success response back to the client.

Now, the problem is that my SLA does not allow 3 to block 4: persistence in this case takes too long, and the client needs to know right away that the request has succeeded (this is the standard use case for the 202 HTTP status code).
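For concreteness, the conventional blocking version of this controller would look something like the following sketch, in the same Rails idiom as the code later in this question (`respond_with` assumes the usual responders setup):

def create
  widget = Widget.new(params[:widget])  # 1. deserialize
  if widget.valid?                      # 2. validate
    widget.save                         # 3. persist (slow; blocks the response)
    respond_with(widget)                # 4. respond only after persistence
  else
    respond_with(widget.errors)
  end
end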

Almost everywhere I look, the assumption is that the way to solve this is to "background" the expensive persistence part. This is typically an awkward process with many moving parts and often its own latency (e.g. making a blocking call to a separate queuing service).
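For instance, here is the standard Rails shape of that backgrounding, sketched with ActiveJob; the job class, queue adapter, and 202 handling are all assumptions for illustration, not a prescription:

# The usual machinery: a job class, a queue backend, and an enqueue call
# that is itself a blocking round trip to that backend.
class PersistWidgetJob < ActiveJob::Base
  queue_as :default

  def perform(widget_attributes)
    Widget.create!(widget_attributes)  # the slow persistence, done later
  end
end

# In the controller:
def create
  widget = Widget.new(params[:widget])
  if widget.valid?
    PersistWidgetJob.perform_later(widget.attributes)  # blocking enqueue
    head :accepted                                     # 202: accepted, not yet persisted
  else
    respond_with(widget.errors)
  end
end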

Simply parallelizing 3 and 4 using native constructs is generally out of the question, as is switching the order so that 4 blocks 3. As near as I can tell, this is mainly because web and app servers are built with the fundamental assumption that a process (and any children it has forked off) is free to be killed/reused as soon as it has sent its response. Or, equivalently, that the response can't be sent until the app has finished doing everything it's going to do.
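The naive native-construct version makes the problem concrete; this is a sketch of what not to do under a process-reusing server like Passenger:

def create
  widget = Widget.new(params[:widget])
  if widget.valid?
    # Nothing stops the server from reaping this process once the
    # response is out, so the save may be killed partway through.
    Thread.new { widget.save }
    respond_with(widget)
  else
    respond_with(widget.errors)
  end
end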

This is intensely frustrating to me! In any other context, I can do what I like with the program's control flow. But when I'm running a Phusion Passenger -> Ruby on Rails setup, and I want to do a thing after I send the response, I'm left with a wide variety of baroque options that all seem to consider it perfectly natural and acceptable to, say, serialize the application state, post it to Amazon SQS, have a basically separate web service that polls SQS and deserializes the old application state, and then do the thing. I had the application state all set up the way I wanted it after sending the response! Why isn't anything written so I can just do:

def create
  widget = Widget.new(params[:widget])  # 1. deserialize
  if widget.valid?                      # 2. validate
    respond_with(widget)                # 4. respond first...
    widget.save                         # 3. ...then persist
  else
    respond_with(widget.errors)
  end
end

Why is there a pervasive assumption that web service stacks will never support this kind of flow? Is there a hidden drawback, or tradeoff, to making it possible to do this?

Best Answer

Now, the problem is that my SLA does not allow 3 to block 4: persistence in this case takes too long, and the client needs to know right away that the request has succeeded (this is the standard use case for the 202 HTTP status code).

But until the state change is persisted, you can't know that you're successful. You might have a sudden problem with electrical power and an errant backhoe (this stuff happens!) and then, when the service resumes, it has completely forgotten about the resource. The client comes back and the server has no idea what it's talking about. That's not good.

No, you need to speed up commits (and avoid unnecessary processing in the critical path).

You may need to think about how you can logically commit faster, about what it means to commit: you can commit just the fact that there is work to do, instead of requiring the result of the work, and that can be done much more rapidly. Then, when the user comes back, either the processing is done, in which case you can 301 to the results, or you can give a result that says why things are still processing or that they've failed.
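As a sketch, assuming a lightweight Task model whose row commits quickly (all of the names here are illustrative):

# WidgetsController#create
# Commit only the fact that there is work to do: a small, fast insert.
def create
  task = Task.create!(kind: "create_widget",
                      payload: params[:widget].to_json,
                      status: "pending")
  head :accepted, location: task_url(task)  # 202 with a link to the task
end

# TasksController#show
# When the user comes back, report where the work stands.
def show
  task = Task.find(params[:id])
  case task.status
  when "done"   then redirect_to widget_url(task.widget_id), status: :moved_permanently  # 301
  when "failed" then render json: { status: "failed", reason: task.reason }
  else               render json: { status: "processing" }
  end
end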

Getting faster commits might mean thinking more carefully about how you deploy. Have you made the right database choice? Is that database hosted on the right hardware? (Commit-heavy loads are much faster when you've got an SSD to host the transaction log on.) I know it's nice to ignore these things and just deal with the data model at an abstract level, but performance is one of those things where the underlying details have a habit of leaking through.


In response to comments clarifying…
If you've got a genuinely expensive task to perform (e.g., you've asked for a collection of large files to be transferred from some third party) you need to stop thinking in terms of having the whole task complete before the response goes back to the user. Instead, make the task itself a resource: you can then respond quickly to the user to say that the task has started (even if that is a little lie; you might just have queued the fact that you want the task to start) and the user can then query the resource to find out whether it has finished. This is the asynchronous processing model (as opposed to the more common synchronous processing model).
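In routing terms, making the task a first-class resource might look like this (names assumed):

# POST /widgets   -> 202 Accepted, Location: /tasks/:id (work queued/started)
# GET  /tasks/:id -> how the task is getting on
Rails.application.routes.draw do
  resources :widgets, only: [:create]
  resources :tasks,   only: [:show]
end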

There are many ways to handle letting the user know that the task has finished. By far the simplest is to just wait until they poll and tell them then. Alternatively, you can push a notification somewhere (maybe an Atom feed, or by sending an email?) but these are much trickier in general; push notifications are unfortunately relatively easy to abuse. I really advise sticking to polling (or getting clever with WebSockets).
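If you do get clever with WebSockets, the server side can stay small; with something like Rails's ActionCable (an assumption about the stack, and a later addition to Rails than the question's setup) the completing task just broadcasts:

# On completion, push the news to anyone subscribed to this task's stream;
# polling remains the fallback for clients that never connect.
ActionCable.server.broadcast("task_#{task.id}", { status: "done" })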

The case where a task might be quick or might be slow, and the user has no way to know ahead of time, is really evil. Unfortunately, the sane way to resolve it is to make the processing model asynchronous; it's possible to flip either way with REST/HTTP (the server sends either a 200 or a 202, where the 202's content would include a link to the processing-task resource) but it is quite tricky. I don't know if your framework supports such things nicely.
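A sketch of that flip, reusing the hypothetical task machinery above: give the work a short inline grace period, then fall back to 202. Note that sleeping in a request worker is itself a cost, which is part of why this is tricky.

def create
  task = Task.create!(payload: params[:widget].to_json, status: "pending")
  PersistWidgetJob.perform_later(task.id)  # a job variant that works off the task row
  10.times do                              # brief grace period for the fast case
    break if task.reload.status == "done"
    sleep 0.05
  end
  if task.status == "done"
    render json: { widget_id: task.widget_id }, status: :ok  # 200: finished fast
  else
    head :accepted, location: task_url(task)                 # 202: still going
  end
end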

Be aware that most users really do not understand asynchronous processing. They do not understand having to poll for results. They do not appreciate that a server can be doing things other than handling what they asked it to do. This means that if you are serving up an HTML representation, you probably ought to include some JavaScript in it to do the polling (or connecting to a WebSocket, or whatever you choose) so that they don't need to know about page refreshing or details like that. That way you can go a long way towards pretending that you're just doing what they asked, but without all the problems associated with actual long-running requests.
