HTTPS load balancing based on some component of the URL

Tags: apache-2.2, load-balancing, pound

We have an existing application that we wish to split across multiple servers (for example: 1000 users total, with 100 users on each of 10 servers).

Ideally, we'd like to be able to relay the HTTPS requests to a particular server based on some component of the URL. For example:
Users 1 through 100 go to http://server1.domain.com/
Users 101 through 200 go to http://server2.domain.com/
etc.

Where the incoming requests look like this:
https://secure.domain.com/user/{integer user # goes here}/path/to/file

Does anyone know of an easy way to do this? Pound looks promising… but it doesn't look like it supports routing based on URL like this.

Even better would be if the mapping didn't need to be hard-coded: the load balancer could make a separate HTTP request to another server to ask "Hey, what server should I relay to for a request to URL {the URL that was requested goes here}?" and then relay to the hostname returned in the HTTP response.

Best Answer

Varnish would probably do it. As with the other options mentioned here, you'd need something like Pound in front of it to act as an SSL terminator. Once that is done, you can set up each real server as a "backend" and then add something like the following to the config:

## Define the back end servers.
backend server01 {
    .host = "192.0.2.1";
    .port = "80";
}
backend server02 {
    .host = "192.0.2.2";
    .port = "80";
}

sub vcl_recv {
    if (req.url ~ "^/user/1[0-9][0-9]/") {
        ## If the user ID in the URL is 100-199, use server01
        set req.backend = server01;
        pipe;
    } else if (req.url ~ "^/user/2[0-9][0-9]/") {
        ## If the user ID in the URL is 200-299, use server02
        set req.backend = server02;
        pipe;
    } else {
        ## If nothing else matches, fall back to server01
        set req.backend = server01;
        pipe;
    }
}

This is just an extract of the relevant sections; a working config will probably need more. For example, you could add the following just after sub vcl_recv { to cache static files so that the servers aren't hit every time for files that don't change:

if (req.request == "GET" && req.url ~ "\.(png|jpg|gif|css)$") {
    lookup;
}

You can even add small inline C programs to the config to talk to an external service and decide which backend to use.
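As a rough illustration of what such an external service might look like, here is a minimal Python sketch. It is not part of Varnish or Pound. It assumes the balancer forwards the original request path as the path of the lookup request, and that backends follow the question's server1.domain.com naming with 100 users per server. The names backend_for_path and LookupHandler and the port 8080 are made up for this example.

```python
import re
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption taken from the question: 100 users per backend server.
USERS_PER_SERVER = 100

# Matches paths such as /user/123/path/to/file and captures the user ID.
USER_PATH = re.compile(r"^/user/([0-9]+)(?:/|$)")


def backend_for_path(path):
    """Return the backend hostname for a request path, or None if no valid user ID is found."""
    match = USER_PATH.match(path)
    if match is None:
        return None
    user_id = int(match.group(1))
    if user_id < 1:
        return None
    # Users 1-100 map to server1, 101-200 to server2, and so on.
    server_number = (user_id - 1) // USERS_PER_SERVER + 1
    return "server%d.domain.com" % server_number


class LookupHandler(BaseHTTPRequestHandler):
    """Answers GET requests with the backend hostname as plain text."""

    def do_GET(self):
        # Hypothetical convention: the lookup request carries the original path.
        backend = backend_for_path(self.path)
        if backend is None:
            self.send_error(404, "no backend for this path")
            return
        body = backend.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Port 8080 on localhost is an arbitrary choice for this sketch.
    HTTPServer(("127.0.0.1", 8080), LookupHandler).serve_forever()
```

With this running, a GET for /user/150/path/to/file would be answered with server2.domain.com. Keeping the mapping in one small function means the ranges can later be replaced by a database or config lookup without touching the balancer config.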