Nginx – proxy_pass on user_agent

http-headers nginx proxypass

I have a SPA where each subpage needs its own <meta> tags. Generating them from the subpages themselves is not easy, so I created a separate route where Facebook or Twitter bots can fetch the appropriate OpenGraph values. It looks like this:

Original URL: http://website.com/contents/1
Route with OG tags for this URL: http://website.com/og/contents/1

I want to use proxy_pass for this, but only for specific User-Agent values. However, the following configuration does not work, i.e. the request is never redirected:

    location /contents {
        resolver 127.0.0.11 ipv6=off valid=5m;

        if ($http_user_agent ~* ("(facebookexternalhit)\/(.*)|(Twitterbot)\/(.*)")) {
            proxy_pass http://$host:8080/open-graph$request_uri;
        }
    }

Can anyone see what is wrong?

Best Answer

First, when you need a string in your nginx config, you can wrap it in single or double quotes or leave it unquoted (unless it contains special characters such as spaces, semicolons or curly brackets). In your case nginx treats the string as unquoted because it starts with a round bracket, so the regex it actually compiles is ("(facebookexternalhit)\/(.*)|(Twitterbot)\/(.*)"), with the round brackets and the double quotes taken literally. It won't match any user agent unless the string contains a literal double quote right before facebookexternalhit/ or somewhere after twitterbot/.
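To make the quoting rule concrete, here is a minimal sketch of forms that nginx parses as intended. The $og_bot variable and the version pattern in the last line are hypothetical and exist only for illustration; set is used because it is an ngx_http_rewrite_module directive.

    location /contents {
        # Unquoted: fine as long as the pattern has no spaces, semicolons or curly brackets.
        if ($http_user_agent ~* (facebookexternalhit|twitterbot)/) {
            set $og_bot 1;
        }

        # Quoted: the quotes only delimit the string and are not part of the regex.
        if ($http_user_agent ~* "(facebookexternalhit|twitterbot)/") {
            set $og_bot 1;
        }

        # Quotes become mandatory once the pattern contains curly brackets.
        if ($http_user_agent ~* "twitterbot/[0-9]{1,2}") {
            set $og_bot 1;
        }
    }

The key point is that a quote only acts as a delimiter when it is the first character of the token. A quote that appears after an opening round bracket is passed to the regex engine as a literal character.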

Second, you don't need so many captures (in fact you don't need them at all, because you never use them later). Captures make nginx spend additional resources while matching the string against the regex, which is not what you want on a high-load system. The following block should work as you expect (note that the / symbol does not need to be escaped, although escaping it won't break the regex):

    if ($http_user_agent ~* (facebookexternalhit|twitterbot)/) {
        proxy_pass http://$host:8080/open-graph$request_uri;
    }

However, this isn't a good solution. It is better to avoid if constructions unless the if block contains only ngx_http_rewrite_module directives. In this case nginx will create two configurations: the first one is used if the User-Agent string matches the regex, the second one if it doesn't. I highly recommend not using the if construction here. You can use a map translation instead:

    map $http_user_agent $og_prefix {
        ~*(facebookexternalhit|twitterbot)/  /open-graph;
    }

    server {
        ...
        location /contents {
            resolver 127.0.0.11 ipv6=off valid=5m;
            proxy_pass http://$host:8080$og_prefix$request_uri;
        }
        ...
    }

The value of the $og_prefix variable will be /open-graph if the User-Agent string matches the regex, or an empty string otherwise.
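As a sketch of how this resolves at request time, the same map can be written with the fallback spelled out. The explicit default line is optional, since an omitted default already yields an empty string, and the example request /contents/1 and the Twitterbot/1.0 user agent are assumptions used only to illustrate the resulting upstream URLs. Note that map has to be declared in the http context, outside any server block.

    # http context (map is not allowed inside server or location)
    map $http_user_agent $og_prefix {
        default                                "";           # any other client: no prefix
        ~*(facebookexternalhit|twitterbot)/    /open-graph;  # crawler: add the prefix
    }

    server {
        location /contents {
            resolver 127.0.0.11 ipv6=off valid=5m;
            # GET /contents/1 from Twitterbot/1.0 -> http://$host:8080/open-graph/contents/1
            # GET /contents/1 from a browser      -> http://$host:8080/contents/1
            proxy_pass http://$host:8080$og_prefix$request_uri;
        }
    }

Because the variable is evaluated per request, a single proxy_pass line covers both cases and no if block is needed.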
