Nginx rewrite all html files except index

nginx, rewrite

I'm having a bit of trouble with Nginx rewrites. I recently moved my blog over to a new engine, and the URL structure has changed. On my old blogging engine, posts were located at URLs of the form http://$host/yyyy/mm/title.html; on the new engine they have the form http://$host/yyyy/mm/title/. The actual file returned by the server resides at /yyyy/mm/title/index.html.

To make sure links in old posts still work I want to do a rewrite in Nginx that looks something like this:

rewrite ^/(\d\d\d\d)/(\d\d)/(.+)\.html$ $scheme://$host/$1/$2/$3/ permanent;

Unfortunately, this catches anything that ends with .html, including index.html, so visiting a URL of the form /(\d\d\d\d)/(\d\d)/(.+)/ causes a redirect loop (nginx internally tries to use /$1/$2/$3/index.html, which redirects to /$1/$2/$3/, which maps back to index.html, etc.).

I'd rather not use if statements if at all possible. Any ideas?

For reference the site is static, and my server config looks something like this (nothing fancy here):

server {
  listen      [::]:80;
  server_name blog.samwhited.com;
  root /var/www/blog.samwhited.com;
  charset utf-8;
  location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ {
    expires max;
    log_not_found off;
  }
  error_page 404 /error/404.html;
  error_page 403 /error/403.html;
  location /error/ { internal; }
}

Here's a quick example of the expected redirect behavior:

http://blog.samwhited.com/2008/09/test.html -> http://blog.samwhited.com/2008/09/test/

It would also be great if index.html could be hidden; nothing I've found online has worked. Ideally this redirect would happen as well:

http://blog.samwhited.com/2008/09/test/index.html -> http://blog.samwhited.com/2008/09/test/

Best Answer

You pretty much need to use an if statement, but it's perfectly safe to do so if you only have rewrite directives inside it. You want to check the $uri variable, since it includes modifications made by the index module.

if ($uri !~ /index\.html$) {
    rewrite ^/(\d\d\d\d)/(\d\d)/(.+)\.html$ /$1/$2/$3/ permanent;
}

To remove the index.html part you need to send a redirect, but only if the client itself requested index.html and it wasn't added by the index module. That calls for a second if statement, this time checking the $request_uri variable, which holds what the client actually sent.

if ($request_uri ~ /index\.html($|\?)) {
    rewrite ^(.*)/index\.html$ $1/ permanent;
}
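
For context, here is a rough sketch of how the two if blocks might sit inside the server block from the question. It assumes both are placed directly at server level, ahead of the location blocks, and that the first pattern follows the question's /yyyy/mm/title.html layout; treat it as an illustration to adapt rather than a tested drop-in config.

```nginx
server {
  listen      [::]:80;
  server_name blog.samwhited.com;
  root /var/www/blog.samwhited.com;
  charset utf-8;

  # Old post URLs: /yyyy/mm/title.html -> /yyyy/mm/title/
  # $uri changes to .../index.html after the index module's internal
  # redirect, so that internal request is skipped and no loop occurs.
  if ($uri !~ /index\.html$) {
    rewrite ^/(\d\d\d\d)/(\d\d)/(.+)\.html$ /$1/$2/$3/ permanent;
  }

  # Hide index.html, but only when the client asked for it explicitly.
  # $request_uri keeps the original request line, so the internal
  # redirect from the index module never triggers this rule.
  if ($request_uri ~ /index\.html($|\?)) {
    rewrite ^(.*)/index\.html$ $1/ permanent;
  }

  location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ {
    expires max;
    log_not_found off;
  }
  error_page 404 /error/404.html;
  error_page 403 /error/403.html;
  location /error/ { internal; }
}
```

With this in place, /2008/09/test.html and /2008/09/test/index.html should both answer with a 301 to /2008/09/test/, while /2008/09/test/ itself should be served from index.html without a further redirect. Running nginx -t before reloading is a cheap way to catch syntax mistakes.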