HAProxy reqrep – Remove URI on Backend Request

Tags: haproxy, load balancing, rewrite, uri

Real quick question regarding HAProxy reqrep. I am trying to rewrite the request line that gets sent to the backend.

I have the following two example URIs. Both share the same domain name, but they are served by different backend web server pools.

http://domain/web1
http://domain/web2

I want web1 to go to backend webfarm1, and web2 to go to webfarm2. This routing already works. However, I also want to strip the /web1 or /web2 prefix from the URI when the request is sent to the backend.

Here is my haproxy.cfg

frontend webVIP_80
        mode http
        bind    :80
        #acl routing to backend
        acl web1_path path_beg /web1
        acl web2_path path_beg /web2

        #which backend
        use_backend webfarm1 if web1_path
        use_backend webfarm2 if web2_path
        default_backend webfarm1

backend webfarm1
        mode http
        reqrep ^([^\ ]*)\ /web1/(.*)     \1\ /\2
        balance roundrobin
        option httpchk HEAD /index HTTP/1.1\r\nHost:\ example.com
        server webtest1 10.0.0.10:80 weight 5 check slowstart 5000ms
        server webtest2 10.0.0.20:80 weight 5 check slowstart 5000ms
backend webfarm2
        mode http
        reqrep ^([^\ ]*)\ /web2/(.*)     \1\ /\2
        balance roundrobin
        option httpchk HEAD /index HTTP/1.1\r\nHost:\ example.com
        server webtest1-farm2 10.0.0.110:80 weight 5 check slowstart 5000ms
        server webtest2-farm2 10.0.0.120:80 weight 5 check slowstart 5000ms

If I go to http://domain/web1 or http://domain/web2, the error logs on a server in each backend show that the request is for the resource /web1 or /web2 respectively. I therefore believe something is wrong with my regular expression, even though I copied and pasted it from the documentation: http://code.google.com/p/haproxy-docs/wiki/reqrep

Summary:
I'm trying to route traffic based on the URI, but I want HAProxy to strip the /web1 or /web2 part of the URI when it sends the request to the backend pool.

Thank you!

-Jim

Best Answer

You have this:

reqrep ^([^\ ]*)\ /web1/(.*)     \1\ /\2

I think you want this:

reqrep ^([^\ ]*\ /)web1[/]?(.*)     \1\2

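The difference is that the second one also matches when the / after webN is omitted, as in a request for /web1 with no trailing slash. The first one requires that slash, so a request for exactly /web1 never matches and reaches the backend unchanged.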
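
To see the difference on concrete request lines, here is a minimal Python sketch. It assumes Python's re module treats these simple patterns the same way HAProxy's regex engine does. The names ORIGINAL, SUGGESTED, and rewrite are made up for illustration, and the patterns are written with plain spaces because only HAProxy's config parser needs the "\ " escape.

```python
import re

# Python translations of the two reqrep rules as (pattern, replacement).
# HAProxy writes "\ " for a literal space because unescaped spaces
# separate config arguments; Python needs no such escape.
ORIGINAL = (r"^([^ ]*) /web1/(.*)", r"\1 /\2")
SUGGESTED = (r"^([^ ]* /)web1[/]?(.*)", r"\1\2")


def rewrite(rule, request_line):
    """Apply one rule; a line that does not match passes through unchanged."""
    pattern, replacement = rule
    return re.sub(pattern, replacement, request_line)


for line in ("GET /web1 HTTP/1.1", "GET /web1/index.html HTTP/1.1"):
    print(f"{line!r}")
    print(f"  original  -> {rewrite(ORIGINAL, line)!r}")
    print(f"  suggested -> {rewrite(SUGGESTED, line)!r}")
```

With the original rule the first request line comes back untouched, which matches the /web1 entries seen in the backend logs. Both rules rewrite the second line the same way.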

In answer to your comment below, going into detail about how the expressions above work is more effort than I can give. However, maybe this will help.

The first parenthesized group "captures" everything that comes before web1 in the request line. So usually that would be GET or POST, plus the space and the leading slash that follow it. The (.*) "captures" everything after web1 and its optional slash, including nothing if there is nothing.

The next part (\1\2) says what to do with those captured parts. It says to form a string composed of \1 (the first captured part) followed by \2 (the second captured part). Since web1 is never captured, it is not assembled into the final output.
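
The capture groups can be inspected directly. This is a rough Python sketch, again assuming Python's re module is a fair stand-in for HAProxy's regex engine on a pattern this simple. PATTERN and split_request_line are hypothetical names, and the sample request line is invented.

```python
import re

# Python translation of the suggested rule, with plain spaces in place
# of HAProxy's "\ " escape.
PATTERN = re.compile(r"^([^ ]* /)web1[/]?(.*)")


def split_request_line(request_line):
    """Return (group 1, group 2) for a matching request line, else None."""
    match = PATTERN.match(request_line)
    return match.groups() if match else None


first, second = split_request_line("POST /web1/login HTTP/1.1")
print(repr(first))           # the method, the space, and the leading slash
print(repr(second))          # everything after "web1" and its optional slash
print(repr(first + second))  # what \1\2 assembles; "web1" is left out
```

A request line that does not start its path with /web1 makes split_request_line return None, which corresponds to HAProxy leaving that line alone.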