NGINX: Ignoring Certain URL Parameters for Cache Purposes


So say my NGINX cache key looks like this:

uwsgi_cache_key $scheme$host$request_method$request_uri;

… and that's mostly what I want. I want NGINX to make a cache key based on the entire URL, including the querystring. So that

https://example.com/?a=1&b=1

and

https://example.com/?a=1&b=2

… are separate pages, cached separately.

However, say that there are other parameters — c and d — that I don't want to affect the cache key. In other words, I want


Case 1

https://example.com/

and

https://example.com/?c=1

and

https://example.com/?c=2

and

https://example.com/?c=1&d=2

… to return the same page from the cache.


Case 2

And I want

https://example.com/?a=1

and

https://example.com/?a=1&d=2

and

https://example.com/?a=1&c=1&d=3

… to return the same page from the cache, which is different from the page in case 1.


I'm looking for a way to construct the uwsgi_cache_key so that it can account for these cases. I don't want to do it through redirects.

The number of parameters that I want to ignore when constructing the key (c and d, in this example) is limited; the number of parameters that I don't want to ignore is not.

How would you go about doing this? (Yes, this is mostly about fbclid and utm_* and their cousins.)


UPDATE:

Here is a rewrite of @tero-kilkanen's solution with map, in cases where fbclid and launcher are the undesired parameters. I don't know how much this slows down responses.

    map $args $cachestep1 {
        default $args;
        # fbclid is the first parameter
        ~^(fbclid=[^&]*&?)(.*)$             $2;
        # fbclid follows any number of other parameters
        ~^(.*)(&fbclid=[^&]*)(&?.*)$        $1$3;
    }

    map $cachestep1 $cacheargs {
        default $cachestep1;
        # same two cases for launcher
        ~^(launcher=[^&]*&?)(.*)$             $2;
        ~^(.*)(&launcher=[^&]*)(&?.*)$        $1$3;
    }
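For completeness, here is a rough sketch of how the filtered query string might be fed into the cache key. It assumes both map blocks sit in the http context and that $cacheargs is the final variable. The upstream address and the cache zone name are placeholders, not values from my real config.

```nginx
# Sketch only: upstream address and zone name are assumptions.
location / {
    include      uwsgi_params;
    uwsgi_pass   127.0.0.1:3031;   # hypothetical upstream
    uwsgi_cache  mycache;          # hypothetical zone, defined elsewhere with uwsgi_cache_path
    # $uri carries no query string, so only the filtered arguments reach the key.
    uwsgi_cache_key $scheme$host$request_method$uri?$cacheargs;
}
```

Note that $uri is the normalized path and can change after internal rewrites, unlike $request_uri, so it is worth checking that this still produces the keys you expect.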

Best Answer

I haven't tested an approach like this, but I think it could work:

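map $args $cacheargs {
    # no a parameter present: pass the arguments through unchanged
    default                $args;
    # a is the first parameter
    ~^a=[^&]*&?(.*)$       $1;
    # a follows other parameters
    ~^(.*)&a=[^&]*(.*)$    $1$2;
}

map $cacheargs $cacheargs1 {
    default                $cacheargs;
    ~^b=[^&]*&?(.*)$       $1;
    ~^(.*)&b=[^&]*(.*)$    $1$2;
}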

uwsgi_cache_key $scheme$host$request_method$uri$cacheargs1;

The first map removes the a parameter from $args and stores the result in $cacheargs.

The second map removes the b parameter from $cacheargs and stores the result in $cacheargs1.

Then $cacheargs1 is used as part of the cache key.
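The same chaining idea could probably be extended to a whole family such as utm_*. The sketch below is untested, and the variable names $noutm1 and $noutm2 are made up for illustration. Each map applies its regex once, so each pass drops at most one utm_* parameter.

```nginx
# Untested sketch. $noutm1 and $noutm2 are hypothetical variable names.
# Chain one pass per utm_* parameter expected in a single URL.
map $args $noutm1 {
    default                          $args;
    ~^utm_[^=&]*=[^&]*&?(.*)$        $1;      # utm_* is the first parameter
    ~^(.*)&utm_[^=&]*=[^&]*(.*)$     $1$2;    # utm_* comes later
}

map $noutm1 $noutm2 {
    default                          $noutm1;
    ~^utm_[^=&]*=[^&]*&?(.*)$        $1;
    ~^(.*)&utm_[^=&]*=[^&]*(.*)$     $1$2;
}
```

With two passes, a=1&utm_source=x&utm_medium=y would be reduced to a=1. A URL carrying three utm_* parameters would need a third pass.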

Original answer below.


You can use:

uwsgi_cache_key $scheme$host$request_method$uri$arg_a$arg_b;

This means that the cache key is built from the normalized URI (without query arguments) plus the values of the query arguments a and b.
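One caveat with plain concatenation: a=1&b=23 and a=12&b=3 would both contribute 123 to the key. A small sketch that keeps the values apart with literal separators (the separators are my choice, any fixed text should do):

```nginx
# The literal "?a=" and "&b=" keep the two values from running together.
uwsgi_cache_key $scheme$host$request_method$uri?a=$arg_a&b=$arg_b;
```

This only works when the set of parameters that matter is known in advance, since each one has to be listed in the key.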