Whole-site delivery on CloudFront

amazon-cloudfront cache cdn

After much searching, I can't seem to find an answer to whether it's viable to cache my entire website in Cloudfront (static assets as well as the HTML returned on a dynamic request).

The Setup

My origin server is NOT an EC2 instance, but is on separate hosting.

I've set up a Cloudfront distribution to cache everything from my origin server, example.com. I can access it via the URL that Cloudfront generated, abcxyz.cloudfront.net.

Set Cloudfront to be "in front" of my origin server?

What I need help understanding is whether I can point my domain to Cloudfront so that example.com goes to the Cloudfront cache first (similar to how you would set up Varnish "in front of" your origin server).

In such a setup, what do I set as my origin server within Cloudfront? Setting it to example.com would lead to a circular reference where Cloudfront attempts to check itself for a resource.

Should my origin server no longer be set to respond to example.com requests in this setup? (This would allow me to set my origin server in Cloudfront to something like "content.example.com" and respond to dynamic and static requests from there.)

Or is Cloudfront not ideal for a whole-site cache? Should I not attempt to serve a dynamic response (the HTML output) from cache, and only serve static assets (js, img, css, etc.)?

Best Answer

After some more research, it seems that Cloudfront CAN cache your whole website, but whether you want to or not merits investigation. Here's hoping this is useful for any future passersby.

Whole Site Delivery

Here's some information on Whole-Site Delivery. Warning: this slideshare is high-level and doesn't get into details of implementation.

In order to accomplish whole-site-delivery, you need to follow these general steps.

Let's assume you want to serve example.com and www.example.com via Cloudfront (you want Cloudfront to act as a site-wide cache, similar to how you might use Varnish).

Set up a distribution

Set up a distribution on Cloudfront for your domain.

  • I typically choose to let my origin server decide on cache settings via output headers
  • Set your origin to be something like content.example.com
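As a rough illustration of letting the origin decide cache settings via output headers, here is a minimal Python sketch. The function name, path rules, and lifetimes are assumptions made up for this example and are not part of my actual setup; the idea is only that the origin sends a Cache-Control header per response and the cache in front of it follows that header.

```python
# Hypothetical helper that picks a Cache-Control header at the origin.
# The path prefixes, extensions, and lifetimes are illustrative assumptions.
STATIC_EXTENSIONS = (".js", ".css", ".png", ".jpg", ".gif", ".ico")


def cache_control_for(path: str) -> str:
    """Return the Cache-Control value the origin would send for a request path."""
    if path.startswith("/admin"):
        # Never let a shared cache store admin pages.
        return "private, no-store"
    if path.lower().endswith(STATIC_EXTENSIONS):
        # Static assets change rarely, so allow a long lifetime (one day here).
        return "public, max-age=86400"
    # Dynamic HTML: cache briefly so edits show up within a few minutes.
    return "public, max-age=300"


if __name__ == "__main__":
    for p in ("/", "/css/site.css", "/admin/login"):
        print(p, "->", cache_control_for(p))
```

How long to cache the HTML is a trade-off: a longer max-age means fewer hits on the origin, but content changes take longer to appear unless you invalidate the cached copy.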

Point your domain

1) Point your domain's top level (example.com and probably www.example.com) to your Cloudfront URL. It'll be something like abc123.cloudfront.net rather than an IP address.

Note that this is a CNAME record (abc123.cloudfront.net) rather than an A record (IP address). Whether your DNS host lets you set a CNAME instead of an A record for the root domain varies between providers.

In fact, setting a CNAME at the root of your domain conflicts with the DNS RFCs: a CNAME cannot coexist with other records for the same name, and the root must carry SOA and NS records. This might restrict you to pointing only the "www" version of your domain at Cloudfront, or to using Route 53 or DNS Made Simple, as this article suggests.

2) Set your DNS records so that something like content.example.com points to your origin server. This gives Cloudfront a way to reach your origin server, while the public still uses example.com and www.example.com to view site content.
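To make the two DNS steps concrete, here is a small Python sketch that models the record layout and checks the origin hostname for the circular reference mentioned in the question. Everything in it is a placeholder or an assumption: the hostnames are the ones used in this answer, the IP is a documentation address, and check_origin is a hypothetical helper, not anything Cloudfront provides.

```python
# Hypothetical DNS layout for the setup described above. The root domain is
# left out because whether it can point at Cloudfront depends on your DNS host.
DNS_RECORDS = {
    "www.example.com": ("CNAME", "abc123.cloudfront.net"),
    "content.example.com": ("A", "203.0.113.10"),  # origin server (placeholder IP)
}

PUBLIC_HOSTS = ["example.com", "www.example.com"]


def check_origin(origin_host, public_hosts, records):
    """Return a list of problems that would stop the cache reaching the origin."""
    problems = []
    if origin_host in public_hosts:
        # The cache would be asking itself for the resource.
        problems.append(f"{origin_host} is also a public hostname: circular reference")
    record = records.get(origin_host)
    if record is None:
        problems.append(f"no DNS record for {origin_host}")
    elif record[0] == "CNAME" and record[1].endswith(".cloudfront.net"):
        problems.append(f"{origin_host} points back at Cloudfront")
    return problems


if __name__ == "__main__":
    for host in ("content.example.com", "www.example.com"):
        print(host, "->", check_origin(host, PUBLIC_HOSTS, DNS_RECORDS) or "ok")
```

The point of the separate content.example.com name is simply that the origin needs a hostname that resolves straight to your server and is never routed through the distribution.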

Caveats

There are a few caveats:

  1. Cookies - Whether or not the cache strips cookies is important. Requests with cookies aren't typically cached (or more accurately, each unique cookie value will create a different cached copy). Consider having Cloudfront ignore server-set cookies so it can cache content. This won't affect cookies added client-side by services such as Google Analytics or Disqus comments. It will affect server logic if you rely on cookies/session IDs to separate guests from authenticated users.

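  2. HTTP verbs - When this was written, Cloudfront supported only GET and HEAD requests; POST and other HTTP verbs resulted in error pages. (Cloudfront has since added support for POST, PUT, PATCH, DELETE and OPTIONS, which you enable per cache behavior.) This has implications if you allow users to submit forms or make AJAX requests.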
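The cookie caveat is easier to see with a toy cache key. The Python sketch below is a simplified model under my own assumptions, not Cloudfront's actual key algorithm; cache_key and the session_id cookie name are hypothetical. It shows why forwarding a session cookie gives every visitor their own cached copy, while ignoring cookies lets all visitors share one.

```python
import hashlib


def cache_key(path, cookies, forwarded_cookies=()):
    """Build a cache key from the path plus only the cookies the cache forwards."""
    kept = sorted((k, v) for k, v in cookies.items() if k in forwarded_cookies)
    raw = path + "|" + "&".join(f"{k}={v}" for k, v in kept)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    visitor_a = {"session_id": "aaa"}
    visitor_b = {"session_id": "bbb"}

    # Forwarding the session cookie: each visitor gets a separate cached copy.
    forwarded = ("session_id",)
    print(cache_key("/", visitor_a, forwarded) == cache_key("/", visitor_b, forwarded))

    # Ignoring cookies: both visitors share the same cached copy.
    print(cache_key("/", visitor_a) == cache_key("/", visitor_b))
```

If the page really does differ per session (a logged-in view, for example), sharing one copy is exactly what you don't want, which is why this caveat matters for sites with authenticated users.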

I don't have public users on my site. I do have an admin area that I alone use. I therefore can enter my site's admin area via content.example.com directly rather than via the public example.com and www.example.com. This bypasses the cache altogether, eliminating the need for passing cookies through and allowing the use of any HTTP verb.

That happens to work for me, but I suspect that's not a great situation for most people. YMMV with Cloudfront and whole-site caching. It's still great for static asset caching.