Mod_rewrite, clean URLs, and page content directories

Tags: apache-2.2, mod-rewrite, rewrite

I am using mod_rewrite in an .htaccess file in our root web directory so that we can use cleaner URLs for some specific cases. For example, we want to translate:

http://example.com/Topic/ABCD/Type/Description

into

http://example.com/Topic.php?l=ABCD&e=Type&t=Description

This seemed easy enough; the .htaccess file consists of:

RewriteEngine on
RewriteRule ^Topic/([^/\.]+)/([^/\.]+)/([^/\.]+)/?$ /Topic.php?l=$1&e=$2&t=$3 [L]

However, all the page's content (.CSS files, images, etc.) ends up 404'ing because instead of, say, loading "/Common.css" from the directory relative to the rewritten URL:

http://example.com/Common.css

it loads it from the un-rewritten URL's directory:

http://example.com/Topic/ABCD/Type/Description/Common.css

which obviously does not exist. However, I've found countless tutorials all over the web that advocate some variant on what I'm doing (one example here), but make no mention of this problem I'm having, so I'm wondering if there's just some configuration I'm overlooking. They can't all rely on having these 'virtual' directories exist and contain the page's content. I also really don't want to make everything into absolute paths.

I have tried adding rules earlier in the .htaccess file to try to short-circuit content requests, such as:

RewriteRule \.(css|jpe?g|gif|png)$ - [L]

but it seems to have no effect. Regardless, I wouldn't have thought such content would even match the original rule, as the regex in that rule does not allow for anything following the (optional) trailing slash.

If I change the rule such that its flag is set to redirect ([R,L] instead of just [L]), then it all works fine — except now the browser displays the rewritten php parameter string, which is what we were trying to avoid. In the past, we've just accepted this as a work-around, but now I'm trying to understand why it's working this way, and what I can do about it.

In case it's relevant, I'm using CentOS 6.2, and Apache/2.2.15.

Best Answer

The problem is most likely that your code generates pages that include CSS like so:

<link rel="stylesheet" type="text/css" href="Common.css" />

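The rewrite happens internally on the server, so the browser never sees it. As far as the browser knows, the page lives at /Topic/ABCD/Type/Description/, so it resolves the relative href against that URL and requests Common.css from that subpath.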
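To see how that resolution plays out, here is a minimal sketch using Python's standard urljoin, which follows the same relative-reference rules a browser applies. The resolve helper is a hypothetical name used only for illustration; the URLs are the ones from the question.

```python
from urllib.parse import urljoin


def resolve(page_url, href):
    """Resolve href against the URL the browser actually requested."""
    return urljoin(page_url, href)


# Without a trailing slash, the last path segment is replaced.
no_slash = resolve("http://example.com/Topic/ABCD/Type/Description", "Common.css")

# With a trailing slash, the href is appended to the full "directory".
with_slash = resolve("http://example.com/Topic/ABCD/Type/Description/", "Common.css")

print(no_slash)
print(with_slash)
```

Neither result points at the real file in the web root, which is why the stylesheet request returns a 404.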

You could (but shouldn't, for reasons I'll explain in a moment) add more rewrite rules that map Common.css etc. back to the single copy at the top level, e.g.:

RewriteEngine on
RewriteRule ^Topic/([^/.]+)/([^/.]+)/([^/.]+)/Common\.css$ /Common.css [L]

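HOWEVER, that would be silly: the browser caches by URL, so it would treat the stylesheet under every topic path as a different file and fetch it again each time, which throws away the main benefit of a shared stylesheet. A much better approach is to simply add a / at the front of Common.css in your code, so it produces HTML like so: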

<link rel="stylesheet" type="text/css" href="/Common.css" />

That way, every page refers to the same Common.css URL, so the browser fetches and caches it once instead of once per logical location, and it knows to request it from the top level rather than from a subpath.
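As a quick check of that claim, the sketch below resolves the root-relative href against several page URLs, again using Python's urljoin as a stand-in for the browser's resolution. The list of pages is an assumption built from the URLs in the question.

```python
from urllib.parse import urljoin

# Page URLs taken from the question: clean URL with and without a
# trailing slash, plus the underlying PHP URL.
pages = [
    "http://example.com/Topic/ABCD/Type/Description",
    "http://example.com/Topic/ABCD/Type/Description/",
    "http://example.com/Topic.php?l=ABCD&e=Type&t=Description",
]

# A leading slash makes the href relative to the site root, so the
# page's own path no longer affects the result.
resolved = {urljoin(page, "/Common.css") for page in pages}

print(resolved)
```

Every page resolves to the single URL http://example.com/Common.css, so one cached copy serves them all.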

PS Jakub: the regex for the RewriteRule already excludes anything with a period in it, which already covers the .css, .jpg, etc, so adding the additional RewriteCond won't help.
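The point about the regex can be checked outside Apache. This is a rough sketch that runs the question's pattern through Python's re module, on the assumption that it behaves the same as Apache's PCRE for a pattern this simple. The paths omit the leading slash, as mod_rewrite does for rules in a root .htaccess file; TOPIC_RULE is a hypothetical name.

```python
import re

# The pattern from the question's RewriteRule, unchanged.
TOPIC_RULE = re.compile(r"^Topic/([^/\.]+)/([^/\.]+)/([^/\.]+)/?$")

# The clean page URL matches and yields the three captured parameters.
page = TOPIC_RULE.match("Topic/ABCD/Type/Description")

# Stylesheet requests produced by the relative href do not match:
# one has an extra path segment, the other contains a period.
asset_deep = TOPIC_RULE.match("Topic/ABCD/Type/Description/Common.css")
asset_shallow = TOPIC_RULE.match("Topic/ABCD/Type/Common.css")

print(page.groups())
print(asset_deep, asset_shallow)
```

Since the asset requests never match the Topic rule, they fall through to the filesystem, where no such file exists. An extra rule or condition that only skips rewriting for them cannot make them succeed.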