CDN for caching a REST API

amazon-web-services apache-2.2 cache cdn reverse-proxy

I am doing some research on CDN providers, but I have a hard time finding out which ones are available, what exactly they offer, and whether they are appropriate for my purpose. Hopefully you guys can give me some advice 🙂

We are hosting a public REST API on an Amazon EC2 instance. Every call is dynamic and CPU intensive, but activity is generally fairly stable. However, there are regularly short, high peaks of many users simultaneously requesting the same resource(s). This happens, for example, after someone blogs or tweets a link to a resource and everyone clicks it.

Many resources don't change frequently, and my server sends explicit Cache-Control max-age headers specifying how long every resource could and should be cached. I need a web cache / reverse proxy / CDN that does a good job of inspecting Cache-Control headers and caching these server calls, so that if 1000 clients request the same resource within a minute, my server only has to serve it once, or at least not 1000 times.

Furthermore, the CDN should be able to cache any HTTP GET request, regardless of the content type or URI. Limitations on file size are no problem; output is generally brief and compact. I was experimenting with Cloudflare today; however, they only cache static files based on the 'file extension' of the URI, which makes it completely useless for most REST APIs. And last but not least, I'm a small startup, so preferably something that is affordable and scales both up and down.

Which are providers that might fit these requirements? Thanks for any experiences/advice.

Best Answer

Any CDN which is capable of "origin fetch" (that's the CDN industry term, most of us would call it a reverse proxy) should do what you need. Amongst low-cost, pay-per-use CDNs, I know these have that feature:

Note that Rackspace Cloud Files, which uses Akamai as a CDN, only supports static origin files uploaded to their servers.

One sticking point may be a minimum cache lifetime. Rapidly churning content creates issues for CDNs, which are designed to serve static content. So if you set "Cache-Control: max-age=5", a particular CDN might change that to some minimum value like 3600, or not cache it at all and simply pass the request back to your origin.
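To make the two behaviours above concrete, here is a rough Python model of how a CDN might treat a short max-age. It is only an illustration: `effective_ttl`, `min_ttl` and `clamp` are made-up names, not any provider's API, and the function expects the header value alone (e.g. `max-age=5`), without the `Cache-Control:` prefix.

```python
import re
from typing import Optional


def parse_max_age(cache_control: str) -> Optional[int]:
    """Return max-age in seconds from a Cache-Control header value, or None."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def effective_ttl(cache_control: str, min_ttl: int, clamp: bool) -> Optional[int]:
    """Model two hypothetical CDN policies for lifetimes below min_ttl.

    Returns the number of seconds the CDN would cache the response,
    or None when it would pass the request straight back to the origin.
    """
    max_age = parse_max_age(cache_control)
    if max_age is None:
        return None  # no max-age: assume the CDN does not cache
    if max_age >= min_ttl:
        return max_age  # long enough, honoured as sent
    # Too short: either raise to the minimum or skip caching entirely.
    return min_ttl if clamp else None
```

With `min_ttl=3600`, a header value of `max-age=5` comes out as 3600 when `clamp` is true and as `None` (not cached) when it is false. Which of the two a given CDN does, if either, is something to check in its documentation.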

If none of the pay-per-use CDNs offer cache lifetimes as short as you need, you may have to look at a contracted CDN service. Otherwise, your best option will be to set up Varnish or Nginx to do caching on one or more of your EC2 instances.
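For a feel of what such a caching layer does with your max-age values, here is a minimal in-process sketch in Python. It is not how Varnish or Nginx are implemented; `MicroCache` and `fetch_origin` are hypothetical names, and `fetch_origin` is assumed to return the response body together with the max-age (in seconds) that your API sent.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class MicroCache:
    """Tiny TTL cache keyed by URI that honours the origin's max-age."""

    def __init__(self, fetch_origin: Callable[[str], Tuple[bytes, int]],
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch_origin  # hypothetical callable: uri -> (body, max_age)
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}
        self._lock = threading.Lock()

    def get(self, uri: str) -> bytes:
        # One global lock keeps the sketch simple: concurrent requests wait
        # here, so only the first one after expiry reaches the origin.
        with self._lock:
            now = self._clock()
            entry = self._entries.get(uri)
            if entry is not None and entry[0] > now:
                return entry[1]  # still fresh: serve from cache
            body, max_age = self._fetch(uri)
            if max_age > 0:
                self._entries[uri] = (now + max_age, body)
            return body
```

If 1000 clients call `get()` for the same URI within the max-age window, `fetch_origin` runs once and the rest are served from memory. A real proxy would lock per URI rather than globally and would also respect directives such as `no-store` and `private`.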