Way to make GoogleImageProxy honor Cache-Control / expiration headers

apache-2.2 cache http-headers

I am suddenly getting flooded with requests from Google's image proxy servers that look like the following:

66.249.81.250 - - [04/May/2015:06:55:54 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
66.249.93.170 - - [04/May/2015:06:56:31 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
66.249.93.170 - - [04/May/2015:06:56:31 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
66.249.83.202 - - [04/May/2015:06:56:44 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
64.233.173.224 - - [04/May/2015:06:56:45 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
66.249.81.244 - - [04/May/2015:06:56:49 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
66.249.83.196 - - [04/May/2015:06:57:19 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
64.233.173.218 - - [04/May/2015:06:57:27 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
66.249.83.208 - - [04/May/2015:06:57:30 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
66.249.88.250 - - [04/May/2015:06:57:32 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
66.249.88.252 - - [04/May/2015:06:57:32 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"

As you can see, my server (Apache 2.2.22) is responding with the full 200 and resending the image for every request. When I make the same request in a browser, I get a 304 response and the following headers:

Cache-Control:max-age=5184000
Date:Mon, 04 May 2015 06:43:00 GMT
Expires:Fri, 03 Jul 2015 06:43:00 GMT
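
As a sanity check, the headers above are internally consistent: max-age=5184000 is 60 days, and Expires is exactly that far after Date. Here is a minimal Python sketch of that arithmetic, using only the header values quoted above. The helper name `max_age_seconds` is illustrative and not part of Apache or any library.

```python
from email.utils import parsedate_to_datetime

# Header values copied from the browser response quoted above.
headers = {
    "Cache-Control": "max-age=5184000",
    "Date": "Mon, 04 May 2015 06:43:00 GMT",
    "Expires": "Fri, 03 Jul 2015 06:43:00 GMT",
}


def max_age_seconds(cache_control):
    """Return the max-age value in seconds, or None if the directive is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age":
            return int(value)
    return None


max_age = max_age_seconds(headers["Cache-Control"])

# Freshness lifetime implied by the Expires header, measured from Date.
lifetime = parsedate_to_datetime(headers["Expires"]) - parsedate_to_datetime(headers["Date"])

print(max_age, "seconds =", max_age // 86400, "days")
print("Expires - Date =", lifetime.days, "days")
```

So both headers tell any cache that the image may be reused for 60 days. Whether a given proxy actually does so is up to the proxy.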

Is there some reason the Google image proxy is not honoring these headers, and is there anything I can do about it beyond turning on something like Cloudflare and hoping for the best? I understand from "Apache logs flooded with connections – '(via ggpht.com GoogleImageProxy)'" that this is "normal" traffic, but I'm not happy about having to re-serve the entire 100 KB file every time.

Best Answer

The Google image proxy does cache.
You can easily test this by embedding an image in an e-mail and sending it to a Gmail account you control.
Reload the page a few times (with your browser cache cleared, of course).
The proxy's cache gets hit, and your server will not receive any further requests.

It's unclear whether the same cached copy gets served to multiple users.
If it does not, each recipient triggers a separate fetch, and that may be what you are seeing.

But anyway: if you send an e-mail with an embedded image to 10 people, you should expect your server to handle 10 requests.
I don't see how that counts as flooding.
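
To judge whether the traffic is closer to "one fetch per recipient" or to real flooding, it can help to tally the proxy requests in the access log. The following is a rough Python sketch, assuming the Apache combined log format shown in the question. The names `LOG_LINE` and `summarize_proxy_hits` are hypothetical, and the sample lines are the first three entries from the question.

```python
import re
from collections import defaultdict

# Matches Apache "combined" log lines like the ones quoted in the question.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def summarize_proxy_hits(lines):
    """Per path: request count, distinct client IPs, and bytes sent to GoogleImageProxy."""
    stats = defaultdict(lambda: {"requests": 0, "ips": set(), "bytes": 0})
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m or "GoogleImageProxy" not in m["agent"]:
            continue  # skip unparseable lines and non-proxy clients
        entry = stats[m["path"]]
        entry["requests"] += 1
        entry["ips"].add(m["ip"])
        if m["size"] != "-":
            entry["bytes"] += int(m["size"])
    return dict(stats)


UA = "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
sample = [
    f'66.249.81.250 - - [04/May/2015:06:55:54 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "{UA}"',
    f'66.249.93.170 - - [04/May/2015:06:56:31 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "{UA}"',
    f'66.249.93.170 - - [04/May/2015:06:56:31 +0000] "GET /images/image_1.jpg HTTP/1.1" 200 93394 "-" "{UA}"',
]

summary = summarize_proxy_hits(sample)
for path, entry in summary.items():
    print(path, entry["requests"], "requests from", len(entry["ips"]), "IPs,", entry["bytes"], "bytes")
```

Comparing the request count per image with the number of recipients of the e-mail gives a rough idea of whether the proxy is fetching once per recipient or more often than that.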
