I'm looking for a way to cache 404s long term (a few days or weeks) on the web server. My current setup is NGINX with a memcached_pass proxy and PHP-FPM to deliver uncached pages (PHP also writes the content to memcached).
Crawlers all around the web seem to like my pages and generate a few thousand 404 requests a day. All of them hit PHP directly, since I can't cache the 404 response header information together with the content in memcached, so the memcached_pass lookup always fails.
How can I cache all those requests that return a 404? Is the HttpProxyModule for Nginx what I'm looking for? Or should I go for Varnish instead?
From my current point of view, I'm not keen to change my entire setup and drop the memcached_pass directive from nginx. It's pretty neat so far, because PHP decides whether a request can (or should) be cached in memcached or not. It's also pretty easy to flush the cache when necessary.
My current NGINX configuration file:
server {
    listen 80;
    server_name _;
    gzip on;
    gzip_http_version 1.0;
    gzip_vary on;
    gzip_comp_level 6;
    gzip_proxied any;
    gzip_types text/plain text/html text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    location / {
        gzip on;
        default_type "text/html; charset=utf-8";
        charset utf-8;
        add_header Content-Encoding gzip;
        if ($request_method = GET)
        {
            expires max;
            set $memcached_key $http_host$request_uri;
            memcached_pass 127.0.0.1:11211;
            error_page 404 = @fallback;
            #error_page 502 = @fallback;
            break;
        }
        root /var/www/html/;
        index index.php index.html;
        if (!-e $request_filename) {
            rewrite ^/(.*)$ /index.php?q=$1 last;
            break;
        }
    }
    location @fallback {
        internal;
        root /var/www/html/;
        index index.php index.html;
        if (!-e $request_filename) {
            rewrite ^/(.*)$ /index.php?q=$1 last;
            break;
        }
    }
    location ~ \.php$ {
        root /var/www/html/;
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME /var/www/html/$fastcgi_script_name;
        include /etc/nginx/fastcgi_params;
    }
}
An example configuration either for Nginx or Varnish would be great.
Thank you! 🙂
Best Answer
Varnish caches 404s by default, so no configuration (beyond the initial, basic Varnish setup) is necessary, unless the backend sends a reply that Varnish considers uncacheable.
If that is the case, you can make the necessary changes to the reply in VCL and force it to be cached.
I haven't provided any examples, because there are none to be given - really.
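That said, for the exceptional case only, where the backend's 404 reply carries headers that make Varnish refuse to cache it, a minimal sketch might look like the following. It assumes VCL 4.0 syntax (Varnish 4 or later; older versions use vcl_fetch instead of vcl_backend_response), that nginx has been moved to port 8080 so Varnish can listen on port 80, and a TTL of 7 days. The port and the TTL are illustrative assumptions, not values taken from the setup above.

```vcl
vcl 4.0;

# Assumption: nginx now listens on 127.0.0.1:8080 behind Varnish.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    if (beresp.status == 404) {
        # Varnish's built-in logic runs after this subroutine and would
        # typically mark the object uncacheable if it still carried a
        # Set-Cookie header or a Cache-Control of no-cache/no-store/private.
        unset beresp.http.Set-Cookie;
        unset beresp.http.Cache-Control;

        # Assumption: keep 404s for one week; adjust to taste.
        set beresp.ttl = 7d;
    }
}
```

Requests that arrive with a Cookie or Authorization header are still passed to the backend by the built-in vcl_recv, so this only helps for plain requests, which is what most crawlers send.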