Apache rewrite rules and special characters

apache-2.2 encoding mod-rewrite rewrite

I have a server where some files have an actual %20 in their name (they are generated by an automated tool which handles spaces this way, and I can't do anything about this); this is not a space: it's "%" followed by "2" followed by "0".

On this server, there is an Apache web server, and there are some web pages which link to those files, using their names in URLs like http://servername/file%20with%20a%20name%20like%20this.html; those pages are also generated by the same tool, so I (again!) can't do anything about that. A full search-and-replace on all files, pages and URLs is out of the question here.

The problem: when Apache gets called with a URL like the one above, it (correctly) translates the "%20"s into spaces, and then of course it can't find the files, because they don't have actual spaces in their names.

How can I solve this?

I discovered that a URL like http://servername/file%2520name.html works nicely, because then Apache translates "%25" into a "%" sign, and thus the correct filename gets built.

I tried using an Apache rewrite rule, and I can successfully replace spaces with hyphens with a syntax like this:
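To illustrate why the double-encoded form reaches the right file, here is a small Python sketch of a single round of percent-decoding. It uses the standard library's `urllib.parse.unquote` as a stand-in for the decoding step; it is not Apache's own code, and the paths are just the examples from this question.

```python
from urllib.parse import unquote

# One round of percent-decoding, as a stand-in for what the server does
# to the request path before looking for the file.
double_encoded = "/file%2520name.html"
single_encoded = "/file%20name.html"

# "%25" decodes to "%", so the literal "%20" in the file name survives.
decoded_double = unquote(double_encoded)

# "%20" decodes to a space, which does not match the file on disk.
decoded_single = unquote(single_encoded)

print(decoded_double)
print(decoded_single)
```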

RewriteRule    (.*)\ (.*)      $1-$2

The problem: when I try to replace them with a "%2520" sequence, this just doesn't happen. If I use

RewriteRule    (.*)\ (.*)      $1%2520$2

then the resulting URL is http://servername/file520name.html. I've tried "%25" too, but then I only get a "5". It looks like the initial "%2" somehow gets discarded.

The questions:

  • How can I build such a regexp to replace spaces with "%2520"?
  • Is this the only way I can deal with this issue (other than a full search-and-replace which, as I said, can't be done), or do you have any better idea?

Edit:

Escaping was the key, it works using this rule:

RewriteRule    (.*)\ (.*)      $1\%2520$2

But it only works if there is one "%20" in the initial URL; I get an "internal server error" if there is more than one.

Looks like I'm almost there… please help 🙂


Edit 2:

I was able to get it to work for two spaces using the following rule:

RewriteRule    (.*)\ (.*)\ (.*)     $1\%2520$2\%2520$3

This is enough for my needs, as URLs generated by the tool can contain at most two "%20"s; but, out of curiosity: is there any way to make this work with any number of spaces? The first rule works for any number of spaces when replacing them with a normal character; the problem happens only when special characters are involved.
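For reference, the pattern (.*)\ (.*) can only replace one space per pass, because the first group is greedy and swallows everything up to the last space. The following Python sketch models that with the `re` module; it is only a model of the regular expression, not of mod_rewrite itself, and the helper names `apply_once` and `apply_until_stable` are made up for illustration.

```python
import re

# Same pattern as the rewrite rule: greedy group, one literal space, greedy group.
RULE = re.compile(r"(.*) (.*)")

def apply_once(path):
    """Apply the rule a single time; only the last space is replaced."""
    return RULE.sub(r"\1%2520\2", path, count=1)

def apply_until_stable(path):
    """Hypothetical helper: repeat the rule until no spaces are left."""
    while " " in path:
        path = apply_once(path)
    return path

decoded_path = "/file with a name.html"
print(apply_once(decoded_path))
print(apply_until_stable(decoded_path))
```

So a general solution needs the rule to be applied once per space, which is what the two-space rule above hard-codes for the two-space case.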

Best Answer

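In a RewriteRule substitution, a % followed by a digit is read as a back-reference (%N refers to a group captured by the last matched RewriteCond pattern). With no such group, "%2" expands to nothing, which is why only "520" is left. You need to escape the % with a backslash: \%.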