Apache .htaccess – RewriteRule and ErrorDocument Configuration

.htaccess, apache-2.4, mod-rewrite, rewrite

I have a .htaccess file:

RewriteEngine On
DirectorySlash Off

RewriteCond %{REQUEST_URI}  !(\.css|\.js|\.png|\.jpg|\.mp4|\.ttf|\.eot|\.woff)$
RewriteRule ^(.*)$ ?page=$1 [NC,L,QSA]

ErrorDocument 404 /e404
ErrorDocument 403 /e404
ErrorDocument 500 /e500

I also have a PHP function that checks whether the requested URI exists in the database. If it does, the function fetches the contents and renders the page; if not, it fetches the 404 error page contents and renders those instead. This part works fine.

The problem arises when I request a URI with one of the extensions listed in the RewriteCond part of the .htaccess file, such as example.com/file.css. In this case, the page gets redirected to example.com/e404, which is what I want. However:

I don't know how to get the "wrong URI" used before redirecting to example.com/e404. I need it because, when the error page loads, a function in the page inserts the "wrong URI" into the database.

Best Answer

Aside: The directives you've posted would create a rewrite loop on a default Apache installation, so perhaps you have other directives or a different configuration that you haven't disclosed in your question? I'll ignore this for the moment, since you don't appear to be experiencing that problem.

I don't know how to get the wrong URI used before redirecting to example.com/e404 ...

But do you need to get this information before "redirecting"? In PHP you can simply check the $_SERVER['REQUEST_URI'] superglobal to get this information (the URL that resulted in the 404). However, note that this contains the slash prefix and query string (if any), which your page URL parameter would not otherwise contain.

Also, just to clarify terminology, this is not strictly a "redirect" to example.com/e404. A "redirect" implies an external HTTP redirect, where the client is sent to a new URL. Apache instead handles the error document with an internal redirect on the server (which is similar to a URL rewrite, but not quite), so the URL the client sees does not change.

One of the potential issues with your code is that the 404 response is served in a rather roundabout way:

  1. If file.css does not exist then Apache triggers an internal redirect to /e404
    (this triggers another round of processing...)
  2. Since /e404 does not end in one of the stated file extensions, it is further rewritten to ?page=e404 (presumably index.php?).

Because of the (unnecessary) second rewrite, other server variables, such as REDIRECT_URL and REDIRECT_QUERY_STRING (which are also passed through to the PHP $_SERVER superglobal), are not set as expected. Ordinarily, these would be set to the "wrong URI" (i.e. the requested URL that triggered the 404), but instead they contain details of the error document itself.

You should do the above in one step and avoid the second rewrite. For example (assuming index.php is the file handling the request), the error document should be set to the desired end result:

ErrorDocument 404 /index.php?page=e404

Incidentally, you don't necessarily need to explicitly pass the HTTP "error status" (i.e. e404) in the URL, as this is available in PHP as $_SERVER['REDIRECT_STATUS'].

RewriteRule ^(.*)$ ?page=$1 [NC,L,QSA]

I've assumed you are rewriting to index.php (the registered DirectoryIndex)? Instead of falling back to let mod_dir issue an additional subrequest for the DirectoryIndex, you should be explicit and include the file you are rewriting to in the substitution. For example:

RewriteRule ^(.*)$ index.php?page=$1 [QSA,L]

(The NC flag is superfluous here.)
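Putting the suggestions so far together, the .htaccess file might look something like the sketch below. This assumes index.php is the front controller (as guessed above). The 403 and 500 lines simply apply the same pattern to your other error documents. The first RewriteRule is an extra guard I've added so that requests already pointing at index.php (including the error documents) are not rewritten a second time. The extension pattern is a tidier but equivalent form of yours.

```apache
RewriteEngine On
DirectorySlash Off

# Added guard (not in the original rules): leave requests for the
# front controller itself untouched, so they are not rewritten again.
RewriteRule ^index\.php$ - [L]

# Route everything except static assets to the front controller.
# Assumption: index.php is the script that reads the "page" parameter.
RewriteCond %{REQUEST_URI} !\.(css|js|png|jpg|mp4|ttf|eot|woff)$
RewriteRule ^(.*)$ index.php?page=$1 [QSA,L]

# Point the error documents directly at the end result, so no
# second rewrite is needed and REDIRECT_URL keeps the failing URL.
ErrorDocument 404 /index.php?page=e404
ErrorDocument 403 /index.php?page=e404
ErrorDocument 500 /index.php?page=e500
```

With this arrangement, a request for a missing file.css should reach index.php in a single step, and the REDIRECT_* variables should describe the original request rather than /e404.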

I don't know how to get the wrong URI used before redirecting to example.com/e404 ...

Stepping back to address your initial query again, as an academic exercise: you can do this on Apache 2.4.13+ (although I don't think you need to) by creating a dynamic ErrorDocument using Apache expressions.

For example, to explicitly pass the URL-path that caused the 404 error in the URL for the error document itself, you can do something like the following:

ErrorDocument 404 /index.php?page=%{escape:%{REQUEST_URI}}&error=404

The URL param page then contains the URL-path (which includes the slash prefix, unlike your rewrite that excludes it). An additional error URL param sends the HTTP status (although, as mentioned above, this is not strictly necessary).
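If the same .htaccess file has to work on servers older than 2.4.13 as well, one possible approach (a sketch only, assuming mod_version is loaded so that the IfVersion container is available) is to declare a static error document first and override it where expressions are supported:

```apache
# Static fallback: used as-is on versions before 2.4.13.
ErrorDocument 404 /index.php?page=e404

# Assumption: mod_version is enabled, otherwise <IfVersion> is not recognised.
<IfVersion >= 2.4.13>
    # Later directive overrides the fallback and passes the failing URL-path.
    ErrorDocument 404 /index.php?page=%{escape:%{REQUEST_URI}}&error=404
</IfVersion>
```

On the fallback path, the PHP script would still need to read the failing URL from $_SERVER['REQUEST_URI'], as described earlier.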
