Redirect all .txt files at the root dir to 404 (Except robots.txt)

.htaccess

I'm trying to configure my .htaccess to return a 404 error for all .txt files in the root directory (except robots.txt).

I tried using:
RedirectMatch 404 [^robots]\.txt$
But it also affects .txt files in my subdirectories.

Thanks.

Best Answer

How about:

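# Optional: serve a custom page for 404 responses
ErrorDocument 404 /404.php
RewriteEngine On
# In .htaccess, RewriteRule patterns are matched without the leading slash
RewriteRule ^robots\.txt$ /robots.real [L]
RewriteRule ^[^/]*\.txt$ /404.php [L]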

where 404.php is a script that returns a 404 status and robots.real is the file name under which you store the contents of your robots.txt.

Omit the first ErrorDocument statement if you don't want to design your own error message page, but it is usually nice to have one, because you can then have it in your own style, as well as perform logic in it to catch misspellings and so on.

Come to think of it, you probably don't need to create a 404 page at all. You can use mod_rewrite to have all .txt documents fetch a non-existent page instead, such as:

RewriteRule ^[^/]*\.txt$ a-page-that-does-not-exist.html [L]

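The example in your question fails because the regular expression [^robots]\.txt$ matches any URL in which the character right before .txt is something other than r, o, b, t, or s. It is not anchored to the root, so it also matches files in subdirectories. RedirectMatch tests the full URL path, which begins with /, so adding ^/ in front restricts the match to one-letter .txt files in the root (other than r.txt, o.txt, b.txt, t.txt, and s.txt).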
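To see the character-class behaviour concretely, here is a small sketch that uses Python's re module as a stand-in for Apache's regex engine. That substitution is an assumption (the two agree on this simple syntax), and the request paths are made up for illustration.

```python
import re

# Hypothetical request paths, for illustration only.
paths = ["/robots.txt", "/notes.txt", "/readme.txt", "/sub/dir/data.txt", "/a.txt"]

# The pattern from the question. RedirectMatch searches within the URL path,
# so re.search (not re.match) is the closest analogue.
unanchored = re.compile(r"[^robots]\.txt$")
unanchored_hits = [p for p in paths if unanchored.search(p)]
print(unanchored_hits)
# ['/readme.txt', '/sub/dir/data.txt', '/a.txt']
# /notes.txt slips through (it ends in "s.txt"), and the subdirectory file is caught.

# The same pattern anchored to the root with ^/ only allows a single character.
anchored = re.compile(r"^/[^robots]\.txt$")
anchored_hits = [p for p in paths if anchored.search(p)]
print(anchored_hits)
# ['/a.txt']
```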

A ^ at the start of the regular expression anchors it to the beginning of the URI; as the first character inside brackets it means "not". Brackets mean "one of the characters inside".

An asterisk (*) in the regexp means zero or more occurrences of whatever immediately precedes it.

Thus, ^/[^r/obts]*\.txt$ matches any .txt file in the root whose name consists of any number (including zero) of characters other than r, /, o, b, t, and s. For instance, it matches /zzzfile.txt but not /mysecretfile.txt, because that name contains r, t, and s. It also doesn't match /xyz/xyz.txt, because of the / inside the brackets. This is closer to what you tried to do, but it excludes far more than just /robots.txt, which is the only file you want to exclude.
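The examples above can be checked with a short sketch, again using Python's re module as an assumed stand-in for Apache's regex engine. The extra paths /notes.txt and /.txt are hypothetical additions to show the edge cases.

```python
import re

# Character class followed by *: any number of characters other than r, /, o, b, t, s.
pattern = re.compile(r"^/[^r/obts]*\.txt$")

paths = [
    "/zzzfile.txt",       # no excluded letters: matches
    "/mysecretfile.txt",  # contains r, t and s: no match
    "/xyz/xyz.txt",       # contains a second slash: no match
    "/robots.txt",        # no match, as intended
    "/notes.txt",         # no match either, although it should be blocked
    "/.txt",              # zero characters before .txt: matches
]

hits = [p for p in paths if pattern.search(p)]
print(hits)
# ['/zzzfile.txt', '/.txt']
```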

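^/[^r/][^o/][^b/][^o/][^t/][^s/]\.txt$ matches six-character names (plus the .txt extension) in which every character differs from the letter at the same position in robots. That excludes robots.txt, but it also excludes any name that shares even one letter position with robots, such as rabbit.txt, and it ignores names of any other length.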
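One way to exclude exactly robots.txt is a negative lookahead, which Apache 2.x should accept since it uses PCRE. The sketch below compares the position-by-position pattern with a lookahead pattern, using Python's re module as an assumed stand-in and made-up paths. If it behaves the same in your setup, the lookahead pattern could be used as RedirectMatch 404 ^/(?!robots\.txt$)[^/]+\.txt$ (not tested against a live server here).

```python
import re

# Hypothetical request paths, for illustration only.
paths = ["/robots.txt", "/secret.txt", "/rabbit.txt", "/notes.txt", "/sub/secret.txt"]

# Position-by-position exclusion: only six-character names, and every
# character must differ from the letter of "robots" at the same position.
positional = re.compile(r"^/[^r/][^o/][^b/][^o/][^t/][^s/]\.txt$")
positional_hits = [p for p in paths if positional.search(p)]
print(positional_hits)
# ['/secret.txt']

# Negative lookahead: any root-level .txt file, unless the whole name is robots.txt.
lookahead = re.compile(r"^/(?!robots\.txt$)[^/]+\.txt$")
lookahead_hits = [p for p in paths if lookahead.search(p)]
print(lookahead_hits)
# ['/secret.txt', '/rabbit.txt', '/notes.txt']
```

The [^/]+ part keeps the match in the root directory, so /sub/secret.txt is left alone in both cases.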