Nginx – How to delete hyphen and underscore from url string in nginx

nginx rewrite url

I have been trying to delete all hyphens (-) and underscores (_) from an incoming URL request to my nginx server.

So to be clear, when someone enters a URL as follows:

https://www.example.com/my-name_is-tom

…I need for nginx to rewrite the URL as follows:

https://www.example.com/mynameistom

I am working with the following config:

server {
   listen 80;
   return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name top.example.com;

    ssl_certificate     /etc/ssl/top.example.com.crt;
    ssl_certificate_key /etc/ssl/top.example.com.key;

    # set the root
    root /srv/top.example.com;
    index index.html;


    location ~ ^/([a-zA-Z0-9=\?\_\-]+)$ {
        rewrite ^(/.*)-(.*)$ $1$2 last;
        rewrite ^(/.*)_(.*)$ $1$2 last;
        rewrite ^/(.*)$ / break;
    }

    location / {
        ssi on;
    }
    # BOSH
    location /http-bind {
        proxy_pass      http://localhost:0000/http-bind;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $http_host;
    }
}

…however, I see no re-writing taking place.

  1. Maybe I crafted that location rewrite thing wrong?

  2. Maybe I need to somehow rewrite the X-Forwarded-For $remote_addr; ???

Any insight / suggestions would be MUCH appreciated — I just don't know much about nginx and regexp.

Thank you all in advance for any time and attention.

EDIT/PS. It seems that I need some kind of rule that removes non-alphanumerics from $request_uri. So this:

example.com/my-name-is-tom.html

would be visually re-written in the browser URL field to:

example.com/mynameistomhtml

I realize how totally odd this sounds, but… that's what needs to happen.

Any further insight would be tremendously appreciated. TY!

Best Answer

To change the URL displayed in the client's browser's address field, you need an external redirect:

rewrite ^(.*)[-_](.*)$ $1$2 permanent;

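If you need to restrict the scope of the rewrite, for example, so that /http-bind/ is not rewritten, you can make the regex more specific. Note that only URIs containing a further slash, such as /http-bind/, are excluded this way; the single-segment URI /http-bind still matches the pattern: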

rewrite ^(/[^/]*)[-_]([^/]*)$ $1$2 permanent;

Explanation: capture and match leading slash followed by zero or more non-slash characters. Match hyphen or underscore. Capture and match zero or more non-slash characters.
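The same rule can be read piece by piece. The annotated sketch below is only an illustration: /a/b-c is a made-up path, and the comments describe how the pattern behaves against each sample URI.

```nginx
# Annotated form of the stricter rule.
#   ^            anchor at the start of the URI
#   (/[^/]*)     $1: the leading slash plus any run of non-slash characters
#   [-_]         the one hyphen or underscore dropped by this redirect
#   ([^/]*)      $2: the rest of the segment, again without slashes
#   $            anchor at the end of the URI
#
# /my-name_is-tom   matches: $1 = /my-name_is, $2 = tom
# /http-bind/       no match: the trailing slash cannot be consumed
# /a/b-c            no match: a second slash appears before the hyphen (hypothetical path)
rewrite ^(/[^/]*)[-_]([^/]*)$ $1$2 permanent;
```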

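Both rewrites redirect with an HTTP 301 response, and each redirect removes a single character: because the first capture is greedy, the last hyphen or underscore goes first. The client follows the chain until no [-_] remains, so /my-name_is-tom becomes /my-name_istom, then /my-nameistom, then /mynameistom.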

Place the rewrite before the first location block.
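As a rough sketch of that placement, assuming the HTTPS server block from the question (the root, index and /http-bind directives are left out here for brevity and would stay as they are):

```nginx
server {
    listen 443 ssl;
    server_name top.example.com;

    ssl_certificate     /etc/ssl/top.example.com.crt;
    ssl_certificate_key /etc/ssl/top.example.com.key;

    # Server-level rewrite: evaluated before nginx selects a location.
    # Each 301 strips one hyphen or underscore from a single-segment URI.
    rewrite ^(/[^/]*)[-_]([^/]*)$ $1$2 permanent;

    location / {
        ssi on;
    }
}
```

After editing, running nginx -t checks the syntax before the configuration is reloaded.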

If you place the rewrite inside a location block, ensure that the location matches the range of URIs that the rewrite is expected to rewrite. However, the rewrite rule is already quite specific, so the presence of a location block is fairly redundant.
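One case where a location block may still be useful is the BOSH endpoint from the question: /http-bind is itself a single segment containing a hyphen, so a server-level rule would redirect it as well. A possible sketch, assuming no regex location is added that also matches /http-bind, puts the rewrite inside location / and relies on the longer prefix match to keep the proxy path out of scope:

```nginx
# Inside the existing HTTPS server block.

location / {
    ssi on;
    # Only requests that end up in this location are redirected.
    rewrite ^(/[^/]*)[-_]([^/]*)$ $1$2 permanent;
}

# BOSH: the longest matching prefix wins, so /http-bind is handled here
# and never reaches the rewrite above.
location /http-bind {
    # proxy_pass and proxy_set_header lines from the question, unchanged
}
```

This trades a little redundancy for an explicit boundary around the proxy path; whether that matters depends on whether the endpoint must stay reachable under its hyphenated name.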

See this document for more.