How to match the Unicode byte sequence for the multibyte hyphen character with mod_rewrite

.htaccess mod-rewrite

We have a case where some Adobe PDF files encode the hyphen character as %E2%80%90 (see http://forums.adobe.com/message/2807241). I guess this is caused by the Calibri font.
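As a quick sanity check on what that sequence is, the sketch below percent-decodes it with Python's standard library. Python is used here only for illustration, and the variable names are made up for the example.

```python
from urllib.parse import unquote
import unicodedata

encoded = "%E2%80%90"        # the sequence that shows up in the PDF links
decoded = unquote(encoded)   # percent-decode, interpreting the bytes as UTF-8

name = unicodedata.name(decoded)      # official Unicode character name
raw_bytes = decoded.encode("utf-8")   # the three bytes behind the escapes

print(name)        # HYPHEN
print(raw_bytes)   # b'\xe2\x80\x90'
```

So the sequence is the UTF-8 encoding of U+2010 HYPHEN, which is a different character from the ASCII hyphen-minus (U+002D) in the real URLs.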

These PDF files have already been released and the links don't work, so I thought mod_rewrite would come to the rescue.

I followed the post "mod_ReWrite to remove part of a URL", but according to this question I can't seem to match the % characters.

Is there anything else I can do?

Here is the rewrite rule I want to use:

RewriteRule ^foo%(.+)bar  /foo-bar [L,R=301]

I also tried this, and it doesn't work either:

RewriteRule ^foo%E2%80%90bar  /foo-bar [L,R=301]

Any ideas?

Best Answer

Using the answer from this question, I was able to come up with this .htaccess rule which fixed my own unicode-hyphen-links-in-pdfs problem:

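# for janky pdfs with links using unicode hyphens
# an underscore follows the hyphen sequence: swap in "-" and re-run the rules ([N])
RewriteRule ^([^_]*)\x25E2\x2580\x2590([^_]*_.*) $1-$2 [N]
# no underscores in the path: swap in "-" and redirect permanently
RewriteRule ^([^_]*)\x25E2\x2580\x2590([^_]*)$ /$1-$2 [L,R=301]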
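To see what the pattern actually matches, here is a rough sketch that runs the second pattern through Python's `re` module. This is only an approximation: mod_rewrite uses PCRE rather than Python's engine, the flags are not modelled, and the sample paths are hypothetical ones borrowed from the question.

```python
import re

# Second pattern from the rules above, copied verbatim.
# \x25 is the regex escape for "%", so the pattern looks for the text "%E2%80%90".
pattern = re.compile(r"^([^_]*)\x25E2\x2580\x2590([^_]*)$")

literal_path = "foo%E2%80%90bar"   # percent signs still present as plain text
decoded_path = "foo\u2010bar"      # path holding the actual U+2010 character

m = pattern.match(literal_path)
target = f"/{m.group(1)}-{m.group(2)}" if m else None

print(target)                       # /foo-bar
print(pattern.match(decoded_path))  # None
```

In other words, the `\x25` escapes are just a way of writing a literal `%` in the pattern, so these rules apply when the path that reaches them still contains the percent signs as text. They would not match a path in which the sequence has already been decoded to the hyphen character itself.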