Apache 2.4.7 mod_proxy_wstunnel tunneling too much (HTTP as well as WS)

apache-2.4 mod-proxy reverse-proxy websocket

I'm running Apache 2.4.7 as a reverse proxy on Ubuntu 14.04 LTS. This Apache server acts as the entry point to a lot of different backend applications, which are accessed via different mod_proxy configurations in <Location> blocks.

I need to provide reverse proxy access to an application that uses WebSockets. The application is a Java Spring application that serves HTML and other static files over HTTP, and then uses a WebSocket for dynamic data after the page has loaded.

I've got the application running behind Nginx using the following config:

location /newapp/ {
    proxy_pass http://newapp.example.com:8080/;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}

Unfortunately, due to needing an Apache authentication module that isn't available on Nginx, I can't use this in production.

What I want to do, in pseudo-Apache-config, is:

<Location /newapp/>
    if not WebSockets:
        ProxyPass http://newapp.example.com:8080/
        ProxyPassReverse /
    else
        ProxyPass ws://newapp.example.com:8080/
        ProxyPassReverse /
</Location>

The Apache mod_proxy_wstunnel module makes me think that this should be possible. The WebSocket is accessed on the URL /api/socket/..., so I've tried separating the two types of ProxyPass using separate <Location> blocks:

<Location /newapp/>
    ProxyPass http://newapp.example.com:8080/ disablereuse=on
    ProxyPassReverse /

    ProxyPassReverseCookieDomain newapp.example.com apps.example.com
    ProxyPassReverseCookiePath http://newapp.example.com:8080/ /newapp/
</Location>

<Location /newapp/api/socket/>
    ProxyPass ws://newapp.example.com:8080/api/socket/
    ProxyPassReverse /
</Location>

This works initially – the browser requests http://apps.example.com/newapp/, the page loads over HTTP, the static assets are loaded, the JavaScript code connects to the websocket, everything is awesome.

However, when a new request is made over HTTP – say, for GET /newapp/static/someimage.png – something goes wrong. This request doesn't match the WebSocket Location, so I would expect it to be proxied to http://newapp.example.com:8080/ as GET /static/someimage.png.

Instead, the application server receives a request for GET /newapp/static/someimage.png and returns a 404, as this isn't a URL that it's aware of. This breaks the application, as HTTP requests that should work fail instead.

Notes:

  • This doesn't just occur for images – GET /newapp/api/ajax/someapicall also gets proxied through incorrectly.
  • This doesn't always happen. While testing things to complete this question I managed to get the app to work completely. It might have been time-based – I left the app running without interacting with it for a few minutes before making any new HTTP requests. When I did make new HTTP requests, they went through correctly.
  • Disabling the <Location /newapp/api/socket/> section causes two things to happen – the WebSocket fails to connect, and the HTTP requests continue to work.
  • I discovered this issue after refreshing the page via the browser's Refresh button. Instead of the page loading again, I saw the app's 404 screen.

What I think is happening:

I think mod_proxy_wstunnel, once activated by the first request to match /newapp/api/socket/, is taking over for all further inbound requests from the client, whether they match the Location or not. I tested this by adding a RequestHeader set Test "some_identifying_value" directive to each Location – the HTTP requests for static files and for /api/socket/info had the Test header on them, but the incorrectly-proxied HTTP requests did not have a Test header on them, which suggests that they are being passed straight through without being processed by the Apache directives.
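For reference, the header test described above can be sketched roughly as follows. This assumes mod_headers is enabled; the two header values are placeholders chosen for illustration, not the ones I actually used. The ProxyPass directives from the earlier blocks are omitted for brevity.

<Location /newapp/>
    # Placeholder value: marks requests handled by the HTTP Location
    RequestHeader set Test "http_location"
</Location>

<Location /newapp/api/socket/>
    # Placeholder value: marks requests handled by the WebSocket Location
    RequestHeader set Test "websocket_location"
</Location>

A request that reaches the backend with no Test header at all was not processed by either block.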

Ultimately, my question is this: Is it possible to configure any version of Apache (I'm happy to upgrade!) to reverse-proxy WebSocket-based applications in such a way that HTTP requests are also reverse-proxied correctly after the WebSocket has connected? If so, how is this configured?

Best Answer

anders' answer got me 95% of the way there.

The basic scenario:

  • We have a server on newapp.example.com
  • Port 8080 is running both HTTP and WebSockets
  • The URL that responds to WebSocket requests is /api/socket/
  • We're reverse-proxying this application as http://apps.example.com/newapp/

This is how to configure WebSockets and HTTP reverse-proxying for the above scenario in a <Location> block:

<Location /newapp/>
    ProxyPass http://newapp.example.com:8080/
    ProxyPassReverse /

    RewriteEngine on
    RewriteCond %{HTTP:UPGRADE} ^WebSocket$ [NC]
    RewriteCond %{HTTP:CONNECTION} Upgrade$ [NC]
    RewriteRule /api/(.*) ws://newapp.example.com:8080/api/$1 [P]
</Location>
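
The block above only works if the proxy and rewrite modules are loaded. As a rough sketch, these are the modules involved; the modules/ path is an assumption and varies by distribution (on Debian/Ubuntu the modules are usually enabled with a2enmod rather than by writing LoadModule lines by hand).

# Assumed module path; adjust to wherever your build installs modules
LoadModule proxy_module modules/mod_proxy.so
# Handles the http:// ProxyPass target
LoadModule proxy_http_module modules/mod_proxy_http.so
# Handles the ws:// target used by the RewriteRule
LoadModule proxy_wstunnel_module modules/mod_proxy_wstunnel.so
# Provides RewriteEngine, RewriteCond and RewriteRule
LoadModule rewrite_module modules/mod_rewrite.so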

The final rewrite rule is crucial: without it, we'll pass the request /newapp/api/socket through to the WebSocket server, which will reject it.

The regex captures everything after /api/. There might be a better way to capture that block, but this worked. We then have to remember to re-add /api/ to the target URL that the [P] flag proxies to.

Most importantly, HTTP requests continue to work after the WebSocket connection is established!
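
On the "better way to capture that block" point, one possibility is to move the rules to virtual-host level, where the RewriteRule pattern sees the full URL path and can be anchored. This is only a sketch that I have not tested against this application: the <VirtualHost> wrapper, the port, and the ServerName line are assumptions based on the scenario above.

<VirtualHost *:80>
    # Assumed public host name, taken from the scenario above
    ServerName apps.example.com

    RewriteEngine on
    # Only requests asking for a WebSocket upgrade match these conditions
    RewriteCond %{HTTP:Upgrade} ^websocket$ [NC]
    RewriteCond %{HTTP:Connection} upgrade [NC]
    # Anchored pattern: drop the /newapp prefix and keep /api/...
    RewriteRule ^/newapp/api/(.*)$ ws://newapp.example.com:8080/api/$1 [P,L]

    # Everything else under /newapp/ is proxied as plain HTTP
    ProxyPass /newapp/ http://newapp.example.com:8080/
    ProxyPassReverse /newapp/ http://newapp.example.com:8080/
</VirtualHost>

Because the pattern is anchored with ^ and $, the prefix stripping is explicit rather than a side effect of matching /api/ somewhere in the path.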