I am fairly new to caching with Varnish, but here is what I've learned so far: there are several factors to consider when putting Varnish in front of an application.
In your case, know which cookies are being set and for what purpose. If Varnish sees a Cookie header on a request, its default behavior is to pass the request to the backend, so the cache is bypassed.
Google Analytics Cookies
If you are using Google Analytics cookies, you can safely strip them in Varnish; don't worry, you will still see the data in your GA reports. Use something like this in your vcl_recv:
set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|__utma_a2a)=[^;]*", "");
You can add a couple more cleanup lines, also in vcl_recv.
Remove a leading ";" left in the Cookie header:
set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
Unset the Cookie header if it is now empty:
if (req.http.Cookie ~ "^\s*$") {
    unset req.http.Cookie;
}
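Put together, the cleanup steps might look like the sketch below. It assumes Varnish 2.1/3.x-style VCL (the syntax used in this answer) and only rearranges the lines already shown; the surrounding `if (req.http.Cookie)` guard is my addition.

```vcl
sub vcl_recv {
    if (req.http.Cookie) {
        # 1. Strip Google Analytics cookies (same pattern as above).
        set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|__utma_a2a)=[^;]*", "");

        # 2. Remove a leading ";" left behind when the first cookie was stripped.
        set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");

        # 3. If nothing is left, drop the header so the request can be looked up in the cache.
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
    # No return here: the built-in vcl_recv logic runs next.
}
```

For example, a request arriving with `Cookie: __utma=1.2; sessionid=abc; __utmz=3` (hypothetical values) should leave step 1 as `; sessionid=abc` and step 2 as `sessionid=abc`, so only the application cookie reaches the pass check.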
Application Specific Cookies
If your application sets a cookie when a user logs in to perform a function, those requests should not be cached; they should be passed directly to the backend. Otherwise, you could cache pages viewed by logged-in users and serve them to other visitors (bad).
Use something like this:
if (req.http.Authorization || req.http.Cookie) {
    return (pass);
}
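If you know the name of the login cookie, a narrower variant is possible. The sketch below assumes Varnish 2.1/3.x-style VCL; `sessionid` is a hypothetical cookie name, and dropping every other cookie is only safe if your application generates no content from them.

```vcl
sub vcl_recv {
    # "sessionid" is a hypothetical name; substitute the cookie your application sets at login.
    if (req.http.Authorization || req.http.Cookie ~ "(^|;\s*)sessionid=") {
        # Logged-in traffic always goes to the backend.
        return (pass);
    }

    # Assumption: any remaining cookies do not influence the generated content,
    # so they can be dropped and the request served from cache.
    unset req.http.Cookie;
}
```

The blanket `req.http.Cookie` check above is the safer starting point; narrow it only once you have confirmed which cookies actually reach Varnish.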
HTH & good luck.
Edit
Use this to see which cookies Varnish is receiving:
varnishtop -i RxHeader -I Cookie
If your regex misses any, you'll catch them here!
By default Varnish doesn't cache requests with a Cookie header:
http://varnish-cache.org/svn/trunk/varnish-cache/bin/varnishd/default.vcl
sub vcl_recv {
    (...)
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    return (lookup);
}
You need to code the behaviour you want into the configuration. Be aware that the Cookie is part of the client request, not the "page" (object, really). The "page" (object) comes with a "Set-Cookie" header - that's the one that will be cached.
Also, "Vary: Cookie" doesn't mean "do not cache". It means cache one object for every value of Cookie received.
If your application doesn't generate any content based on Cookie, it's probably safe to ignore it:
- if (req.http.Authorization || req.http.Cookie) {
+ if (req.http.Authorization) {
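Note that the built-in vcl_recv still runs after your own vcl_recv unless yours returns, so the changed condition only takes effect if you return explicitly. A minimal sketch, assuming Varnish 2.1/3.x-style VCL (`req.request`, `return (lookup)`) and an application whose content never depends on cookies:

```vcl
sub vcl_recv {
    # Only GET and HEAD are looked up in the cache, as in the default logic.
    if (req.request != "GET" && req.request != "HEAD") {
        return (pass);
    }

    # Cookie is deliberately no longer checked here.
    if (req.http.Authorization) {
        return (pass);
    }

    # Returning here skips the rest of the built-in vcl_recv,
    # including its Cookie check.
    return (lookup);
}
```

An alternative with less to maintain is to `unset req.http.Cookie;` in your vcl_recv without returning, and let the built-in logic handle the rest.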
Do some tests and you will get the hang of it. Hope this helps.
Best Answer
This is not stripping any cookie, but rather running regsub over a lot of URI extensions/parameters (like ver=somethingsomething). Personally, I think that if you didn't intentionally write this, you shouldn't use it.
Regarding the question of what impact removing the Google __utm* cookies will have on Analytics: you link to an external JS script, the client fetches it, and the script sets cookies scoped to your domain. The next request the user makes to YOU contains this Cookie, which prevents you from using a user-independent cache. So you remove this cookie on YOUR side. Google Analytics is not affected, because its script reads the cookies on the client side, not from the headers your server receives; in other words, on the server side they serve no purpose for your site. Analytics gets its information directly from the client's requests to Google. You should obviously not issue any cookies with conflicting names, as that could cause problems.
I basically use the example on varnish-cache.org: