Why Can’t a Server Act on a Request After Sending the Response?

Tags: architecture, async, web-applications

Say, for example, I've got a RESTful web service, and I need to support creating a widget via a POSTed HTTP request. I need to:

  1. Deserialize the POSTed widget.
  2. Validate the deserialized widget.
  3. Persist the widget to a backing service.
  4. Send a success response back to the client.

Now, the problem is that my SLA does not allow 3 to block 4: persistence in this case takes too long, and the client needs to know right away that the request has succeeded (this is the standard use case for the 202 HTTP status code).
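For concreteness, the conventional blocking version of this controller would look something like the following sketch, in the same Rails idiom as the code later in this question (`respond_with` assumes the usual responders setup):

def create
  widget = Widget.new(params[:widget])  # 1. deserialize
  if widget.valid?                      # 2. validate
    widget.save                         # 3. persist (slow; blocks the response)
    respond_with(widget)                # 4. respond only after persistence
  else
    respond_with(widget.errors)
  end
end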

Almost everywhere I look, the assumption is that the way to solve this is to "background" the expensive persistence part. This is typically an awkward process with many moving parts and often its own latency (e.g. making a blocking call to a separate queuing service).
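For instance, here is the standard Rails shape of that backgrounding, sketched with ActiveJob; the job class, queue adapter, and 202 handling are all assumptions for illustration, not a prescription:

# The usual machinery: a job class, a queue backend, and an enqueue call
# that is itself a blocking round trip to that backend.
class PersistWidgetJob < ActiveJob::Base
  queue_as :default

  def perform(widget_attributes)
    Widget.create!(widget_attributes)  # the slow persistence, done later
  end
end

# In the controller:
def create
  widget = Widget.new(params[:widget])
  if widget.valid?
    PersistWidgetJob.perform_later(widget.attributes)  # blocking enqueue
    head :accepted                                     # 202: accepted, not yet persisted
  else
    respond_with(widget.errors)
  end
end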

Simply parallelizing 3 and 4 using native constructs is generally out of the question, as is switching the order so that 4 blocks 3. As near as I can tell, this is mainly because web and app servers are built with the fundamental assumption that a process (and any children it has forked off) is free to be killed/reused as soon as it has sent its response. Or, equivalently, that the response can't be sent until the app has finished doing everything it's going to do.
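The naive native-construct version makes the problem concrete; this is a sketch of what not to do under a process-reusing server like Passenger:

def create
  widget = Widget.new(params[:widget])
  if widget.valid?
    # Nothing stops the server from reaping this process once the
    # response is out, so the save may be killed partway through.
    Thread.new { widget.save }
    respond_with(widget)
  else
    respond_with(widget.errors)
  end
end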

This is intensely frustrating to me! In any other context, I can do what I like with the program's control flow. But when I'm running a Phusion Passenger -> Ruby on Rails setup, and I want to do a thing after I send the response, I'm left with a wide variety of baroque options that all seem to consider it perfectly natural and acceptable to, say, serialize the application state, post it to Amazon SQS, have a basically separate web service that polls SQS and deserializes the old application state, and then do the thing. I had the application state all set up the way I wanted it after sending the response! Why isn't anything written so I can just do:

def create
  widget = Widget.new(params[:widget])  # 1. deserialize
  if widget.valid?                      # 2. validate
    respond_with(widget)                # 4. respond first...
    widget.save                         # 3. ...then persist
  else
    respond_with(widget.errors)
  end
end

Why is there a pervasive assumption that web service stacks will never support this kind of flow? Is there a hidden drawback, or tradeoff, to making it possible to do this?

Best Answer

Now, the problem is that my SLA does not allow 3 to block 4: persistence in this case takes too long, and the client needs to know right away that the request has succeeded (this is the standard use case for the 202 HTTP status code).

But until the state change is persisted, you can't know that you're successful. You might have a sudden problem with electrical power and an errant backhoe (this stuff happens!) and then, when the service resumes, it has completely forgotten about the resource. The client comes back and the server has no idea what it's talking about. That's not good.

No, you need to speed up commits (and avoid unnecessary processing in the critical path).

You may need to think about how you can logically commit faster, about what it means to commit: you can commit just the fact that there is work to do, instead of requiring the result of the work, and that can be done much more rapidly. Then, when the user comes back, either the processing is done, in which case you can 301 to the results, or you can give a result that says why things are still processing or that they've failed.
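As a sketch, assuming a lightweight Task model whose row commits quickly (all of the names here are illustrative):

# WidgetsController#create
# Commit only the fact that there is work to do: a small, fast insert.
def create
  task = Task.create!(kind: "create_widget",
                      payload: params[:widget].to_json,
                      status: "pending")
  head :accepted, location: task_url(task)  # 202 with a link to the task
end

# TasksController#show
# When the user comes back, report where the work stands.
def show
  task = Task.find(params[:id])
  case task.status
  when "done"   then redirect_to widget_url(task.widget_id), status: :moved_permanently  # 301
  when "failed" then render json: { status: "failed", reason: task.reason }
  else               render json: { status: "processing" }
  end
end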

Getting faster commits might mean thinking more carefully about how you deploy. Have you made the right database choice? Is that database hosted on the right hardware? (Commit-heavy loads are much faster when you've got an SSD to host the transaction log on.) I know it's nice to ignore these things and just deal with the data model at an abstract level, but performance is one of those things where the underlying details have a habit of leaking through.


In response to comments clarifying…
If you've got a genuinely expensive task to perform (e.g., you've asked for a collection of large files to be transferred from some third party) you need to stop thinking in terms of having the whole task complete before the response goes back to the user. Instead, make the task itself a resource: you can then respond quickly to the user to say that the task has started (even if that is a little lie; you might just have queued the fact that you want the task to start) and the user can then query the resource to find out whether it has finished. This is the asynchronous processing model (as opposed to the more common synchronous processing model).
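In routing terms, making the task a first-class resource might look like this (names assumed):

# POST /widgets   -> 202 Accepted, Location: /tasks/:id (work queued/started)
# GET  /tasks/:id -> how the task is getting on
Rails.application.routes.draw do
  resources :widgets, only: [:create]
  resources :tasks,   only: [:show]
end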

There are many ways to handle letting the user know that the task has finished. By far the simplest is to just wait until they poll and tell them then. Alternatively, you can push a notification somewhere (maybe an Atom feed, or by sending an email?) but these are much trickier in general; push notifications are unfortunately relatively easy to abuse. I really advise sticking to polling (or getting clever with WebSockets).
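If you do get clever with WebSockets, the server side can stay small; with something like Rails's ActionCable (an assumption about the stack, and a later addition to Rails than the question's setup) the completing task just broadcasts:

# On completion, push the news to anyone subscribed to this task's stream;
# polling remains the fallback for clients that never connect.
ActionCable.server.broadcast("task_#{task.id}", { status: "done" })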

The case where a task might be quick or might be slow, and the user has no way to know ahead of time, is really evil. Unfortunately, the sane way to resolve it is to make the processing model asynchronous; it's possible to flip either way with REST/HTTP (the server sends either a 200 or a 202, where the 202's content would include a link to the processing-task resource) but it is quite tricky. I don't know if your framework supports such things nicely.
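A sketch of that flip, reusing the hypothetical task machinery above: give the work a short inline grace period, then fall back to 202. Note that sleeping in a request worker is itself a cost, which is part of why this is tricky.

def create
  task = Task.create!(payload: params[:widget].to_json, status: "pending")
  PersistWidgetJob.perform_later(task.id)  # a job variant that works off the task row
  10.times do                              # brief grace period for the fast case
    break if task.reload.status == "done"
    sleep 0.05
  end
  if task.status == "done"
    render json: { widget_id: task.widget_id }, status: :ok  # 200: finished fast
  else
    head :accepted, location: task_url(task)                 # 202: still going
  end
end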

Be aware that most users really do not understand asynchronous processing. They do not understand having to poll for results. They do not appreciate that a server can be doing things other than handling what they asked it to do. This means that if you are serving up an HTML representation, you probably ought to include some JavaScript in it to do the polling (or connecting to a WebSocket, or whatever you choose) so that they don't need to know about page refreshing or details like that. That way you can go a long way towards pretending that you're just doing what they asked, but without all the problems associated with actual long-running requests.
