Force caching of handler output which actively resists caching

apache-2.4 mod-cache

I'm trying to force caching of a very obnoxious PHP script which, for no good reason, resists caching by setting all the anti-cache headers:

Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Content-Type:  text/html; charset=UTF-8
Date:          Thu, 22 May 2014 08:43:53 GMT
Expires:       Thu, 19 Nov 1981 08:52:00 GMT
Last-Modified: 
Pragma:        no-cache
Set-Cookie:    ECSESSID=...; path=/
Vary:          User-Agent,Accept-Encoding
Server:        Apache/2.4.6 (Ubuntu)
X-Powered-By:  PHP/5.5.3-1ubuntu2.3

If at all avoidable, I do not want to modify this 3rd party code and would instead just get Apache to cache the page for a while. I'm doing this selectively, only for specific pages which have no real impact on session cookies or the like, i.e. which do not contain any personalised information.

CacheDefaultExpire      600
CacheMinExpire          600
CacheMaxExpire          1800
CacheHeader             On
CacheDetailHeader       On
CacheIgnoreHeaders      Set-Cookie
CacheIgnoreCacheControl On
CacheIgnoreNoLastMod    On
CacheStoreExpired       On
CacheStoreNoStore       On
CacheLock               On

CacheEnable disk /the/script.php

Apache is caching the page alright:

[cache:debug] AH00698: cache: Key for entity /the/script.php?(null) is http://example.com:80/the/script.php?
[cache_disk:debug] AH00709: Recalled cached URL info header http://example.com:80/the/script.php?
[cache_disk:debug] AH00720: Recalled headers for URL http://example.com:80/the/script.php?
[cache:debug] AH00695: Cached response for /the/script.php isn't fresh. Adding conditional request headers.
[cache:debug] AH00750: Adding CACHE_SAVE filter for /the/script.php
[cache:debug] AH00751: Adding CACHE_REMOVE_URL filter for /the/script.php
[cache:debug] AH00769: cache: Caching url: /the/script.php
[cache:debug] AH00770: cache: Removing CACHE_REMOVE_URL filter.
[cache_disk:debug] AH00737: commit_entity: Headers and body for URL http://example.com:80/the/script.php? cached.

However, it always insists that the "cached response isn't fresh" and never serves the cached version. I guess this has to do with the Expires header, which marks the document as expired (but I don't know whether that's the correct assumption). I've tried to overwrite and unset headers using mod_headers, but this doesn't help; whatever combination I try, the cache is not impressed at all. I'm guessing that the order of operations is wrong and the headers are being rewritten after the cache sees them. Early header processing doesn't help either. I've experimented with CacheQuickHandler Off and with setting explicit filter chains, but nothing helps. I'm really mostly poking in the dark, as I do not have a lot of experience with configuring Apache filter chains.

Is there a straightforward way to cache this obnoxious piece of code?

Best Answer

The empty Last-Modified: header makes the response uncacheable, so the first thing to do is to generate a valid one. One solution is to use PHP's auto_prepend_file:

Create a file prepend.php with content like this:

  <?php header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT"); ?>

Then add this directive to your vhost configuration: php_value auto_prepend_file path_to_prepend.php
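
In context, the directive might sit in the vhost roughly as follows. This is a minimal sketch: the path /path/to/prepend.php is a placeholder, and php_value is only understood when PHP runs as an Apache module (mod_php). With PHP-FPM or CGI the setting would have to go into the PHP configuration instead.

```apache
<VirtualHost your_current_virtual_host:80>
    ServerName your.site.com

    # Placeholder path: point this at wherever prepend.php actually lives
    php_value auto_prepend_file "/path/to/prepend.php"
</VirtualHost>
```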

At this point, verify that the server response carries a correct Last-Modified: header. If it doesn't, we won't be able to cache the response. If it does, your existing mod_headers and mod_cache setup may already work.

If it still doesn't, you can combine Squid and Apache like this:

Apache Configuration

In your current vhost, just enable mod_rewrite and use it to redirect the traffic you want to cache:

<VirtualHost your_current_virtual_host:80>
 ServerName your.site.com
 ..
 RewriteEngine on

 # This enables the caching server to see the request as http://your.site.com/..
 ProxyPreserveHost on

 # This should be at VirtualHost level, not <Directory> or .htaccess


 # The DoSquid env variable decides if we send the request to the cache server
 # Adjust it for your needs
 RewriteRule /the/script.php - [E=DoSquid:Yes]

 # POSTs are not cacheable
 RewriteCond %{REQUEST_METHOD} ^POST$ 
 RewriteRule .* - [E=DoSquid:No]

 # Feel free to add any rule which makes sense for your needs

 # Requests from localhost are calls from the "primary" vhost ( see below )
 RewriteCond %{REMOTE_ADDR} ^127\.0\.0\.1$
 RewriteRule .* - [E=DoSquid:No]

 RewriteCond %{ENV:DoSquid} ^Yes$
 RewriteRule /the/script.php http://ip_of_caching_server/the/script.php [P,L,QSA]

 ..
 ..
</VirtualHost>

# This VirtualHost will be accessed by your caching server as the primary server for your site
# Port 8009 can be anything, it just must be a separate virtual host

<VirtualHost your_current_virtual_host:8009>
 ServerName your.site.com
 ..
 RewriteEngine on

 # Here I make heavy use of mod_headers in order to get a cacheable response.
 # Needless to say, this might completely break your application: the responses are
 # completely anonymized.

 Header unset Set-Cookie
 Header unset Etag
 Header unset Pragma
 RequestHeader unset Cookie

 # Now fix the Cache-Control header..
 Header merge Cache-Control public
 # The max-age is a pain. We have to set one if it's not set, and we have to change it if it's 0
 Header merge Cache-Control "max-age=bidon"
 # Case when we have: Cache-Control max-age=.., ....
 Header edit  Cache-Control "^(.*)max-age=(.*)max-age=bidon, (.*)$" $1max-age=$2$3
 # Case when we have: Cache-Control yyy=bidon, max-age=..
 Header edit  Cache-Control "^(.*)max-age=(.*), max-age=bidon$" $1max-age=$2
 # Now replace the value if there was no max-age: set it to 10 minutes
 Header edit  Cache-Control "max-age=bidon" "max-age=600"
 # Now replace the value if there was a max-age=0: set it to 10 minutes
 Header edit  Cache-Control "max-age=0" "max-age=600"

 # Remove Cache-Control parameters which prevent caching
 Header edit Cache-Control "no-cache, " ""
 Header edit Cache-Control "no-store, " ""
 Header edit Cache-Control "post-check=0, " ""
 Header edit Cache-Control "pre-check=0, " ""
 Header edit Cache-Control "must-revalidate, " ""

 # The request is now forwarded to the first vhost. It will not loop because we do not cache requests from 127.0.0.1
 ProxyPreserveHost on
 RewriteRule ^/(.*)$ http://127.0.0.1/$1 [P,L,QSA]

 ..
 ..
</VirtualHost>

Cache server Configuration

You can probably use anything: Squid, Apache, Varnish. With Squid, you have to configure it as a reverse proxy and declare cache_peer your.site.com parent 8009 0 no-query originserver .. Alternatively, you may be able to just enable mod_cache in the second vhost to achieve what you want.
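
If Apache itself plays the role of the caching server, a separate caching vhost could look roughly like the sketch below. This is untested and rests on assumptions that are not part of the setup above: port 8010 and the CacheRoot path are placeholders, and mod_cache, mod_cache_disk, mod_proxy and mod_proxy_http are assumed to be loaded.

```apache
# Sketch only: port 8010 and the CacheRoot path are placeholders.
Listen 8010

<VirtualHost *:8010>
    ServerName your.site.com

    # Store cacheable responses on disk and report hits/misses in X-Cache
    CacheRoot   /var/cache/apache2/mod_cache_disk
    CacheEnable disk /
    CacheHeader On

    # Fetch cache misses from the header-rewriting vhost on port 8009
    ProxyPreserveHost On
    ProxyPass        / http://127.0.0.1:8009/
    ProxyPassReverse / http://127.0.0.1:8009/
</VirtualHost>
```

With something like this in place, the proxy RewriteRule in the first vhost would point at 127.0.0.1:8010 where it currently says ip_of_caching_server.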
