Akamai vs smaller CDN for small/medium-size ecommerce traffic? (caching, latency, NetStorage)

cache cdn latency

The small-to-medium-sized online retail company I work for uses Akamai as its CDN for static images, but I'm wondering whether it might be hurting rather than helping and, if it's suboptimal, what we should be doing instead.

We get monthly traffic of about 3M pageviews and 400K unique visitors. We have 100k+ different static images that appear in our various web pages (several different images for each of several thousand products, etc).

The problem is that Akamai's servers are requesting files from the origin server (which we host ourselves) for about 40% of all browser requests. That means a lot of (in my view) unnecessary waiting for our customers: 40% of all requests have to make the round-trip between Akamai and our origin before returning to the customer.
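
As a rough way to size that cost, the average extra wait per request is about the miss ratio times the edge-to-origin round-trip time. A minimal sketch; the 120 ms round-trip is a made-up figure for illustration, not something I've measured:

```python
def expected_extra_latency_ms(miss_ratio: float, origin_rtt_ms: float) -> float:
    """Average extra wait per request caused by edge cache misses."""
    if not 0.0 <= miss_ratio <= 1.0:
        raise ValueError("miss_ratio must be between 0 and 1")
    return miss_ratio * origin_rtt_ms


# 0.40 is the miss ratio described above; 120.0 ms is a hypothetical RTT.
print(expected_extra_latency_ms(0.40, 120.0))
```

This ignores time spent on the origin itself and any variation between edge locations, so it is only a lower-bound estimate.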

Server TTLs aren't the issue; they're all set to 365 days. So it seems like either

  1. Akamai's edge servers aren't keeping our stuff in cache long enough before swapping it out in favor of content that gets higher traffic than ours, and/or
  2. there are so many Akamai edge servers (they claim 70K+ worldwide) that each server doesn't get enough traffic from our 400K monthly visitors to build up much of a cache of our files.

So I've started wondering whether we might be better served by a CDN with fewer servers. My thinking is that with fewer CDN servers, each server would see more of our traffic, so more of our images would be cached on each one and would probably stay in cache longer before being swapped out. On the other hand, fewer servers probably means more latency for users who aren't close to one of them.
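
To make that intuition concrete, here is a toy simulation of the "too many edge servers" idea. It is only a sketch under loud assumptions: requests are routed to a uniformly random edge, every edge has its own LRU cache, and image popularity is Zipf-like. The object count, cache size, and request count are all invented, and none of this reflects how Akamai actually routes or evicts.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def simulate_hit_ratio(num_edges, num_objects=2_000, cache_size=500,
                       num_requests=50_000, seed=1):
    """Toy model: each request hits a random edge that has its own LRU cache."""
    rng = random.Random(seed)
    # Zipf-like popularity (an assumption): object k has weight 1 / (k + 1).
    cum_weights = list(accumulate(1.0 / (k + 1) for k in range(num_objects)))
    objects = range(num_objects)
    caches = [OrderedDict() for _ in range(num_edges)]
    hits = 0
    for _ in range(num_requests):
        obj = rng.choices(objects, cum_weights=cum_weights)[0]
        cache = caches[rng.randrange(num_edges)]
        if obj in cache:
            cache.move_to_end(obj)          # mark as most recently used
            hits += 1
        else:
            cache[obj] = True               # "fetch from origin" and store
            if len(cache) > cache_size:
                cache.popitem(last=False)   # evict least recently used
    return hits / num_requests


if __name__ == "__main__":
    for edges in (1, 10, 200):
        print(edges, round(simulate_hit_ratio(edges), 3))
```

With the same total traffic, spreading requests over more edges leaves each edge with fewer requests to warm its cache. Real CDNs route by geography rather than at random, so the sizes of the effect here say nothing about real-world numbers.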

There are two Akamai-based options we're looking at but haven't pulled the trigger on (yet):

  • We haven't used their NetStorage service yet, partly because there's a technical hurdle (which will be the topic of my next SF question if we go in that direction) and partly because 40% of the time there'd still be that extra round-trip between the edge server and the origin. It would just be a round-trip inside Akamai's network instead of out to our separately hosted origin: probably faster, but still a round-trip.
  • We don't pay for Akamai's optional tiered distribution service. That would probably alleviate the problem to a large degree, but (1) it's not cheap, and (2) again, 40% of the time there'd still be a round-trip between the edge server and its tier hub.

So my questions are:

  1. Do y'all think it would be better to have the files cached on fewer servers, at the cost of the extra latency for some users; or is latency a bigger issue than the origin round-trips?

  2. If we go with NetStorage, does anyone have any insight as to how long round-trips to the NetStorage "origin" typically take?

  3. Am I missing anything? What else should I be thinking about here?

Best Answer

It's always better to base things on facts rather than supposition. Your hypothesis is that a CDN with fewer nodes would hit your origin less often. To test that, I would:

  1. Set up an account on multiple CDNs that do origin fetch but have a "super-POP" model. Amazon CloudFront, MaxCDN, and Voxel come to mind. CacheFly perhaps as well, although I think you cannot do origin fetch there without a services engagement. Limelight pioneered the "super-POP" model, but requires going through the sales song-and-dance to set up a test.
  2. Divert some statistically significant portion of your CDN traffic to those test CDNs. This could be as simple as a rewrite rule applied n% of the time, but the "content mix" needs to be exactly the same for all CDNs. You can't restrict the test to a subset of your static assets, since the number of static assets may also have an effect. Maybe keep 80% of traffic on Akamai for the test, and send 5% to each of the other CDNs.
  3. Use your webserver logs to gauge how many origin hits you're seeing from each CDN versus the amount of traffic sent to each CDN.
  4. Use something like Pingdom to measure worldwide response times for various static assets on each CDN. Offloading hits from your origin is one indicator of a CDN doing its job, but end-user latency matters too.
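
Steps 2 and 3 could be sketched roughly as follows. This is an illustration, not a drop-in rule: the hostnames and the `visitor_id` (say, a cookie value) are hypothetical, and the weights simply mirror the 80/5/5/5/5 split suggested above. Hashing on the visitor rather than on the asset path keeps the content mix identical across CDNs.

```python
import hashlib

# Hypothetical hostnames; weights are percentages of traffic.
CDN_WEIGHTS = [
    ("akamai.example.com", 80),
    ("cloudfront.example.com", 5),
    ("maxcdn.example.com", 5),
    ("voxel.example.com", 5),
    ("cachefly.example.com", 5),
]


def pick_cdn(visitor_id, weights=CDN_WEIGHTS):
    """Deterministically assign a visitor to one CDN in proportion to weights."""
    total = sum(weight for _, weight in weights)
    digest = hashlib.sha256(visitor_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    for host, weight in weights:
        if bucket < weight:
            return host
        bucket -= weight
    raise AssertionError("unreachable when all weights are positive")


def origin_hit_ratio(origin_fetches, edge_requests):
    """Per CDN: origin fetches divided by requests sent to that CDN.

    Both arguments are dicts of counts you would tally from your own logs.
    """
    return {
        cdn: origin_fetches.get(cdn, 0) / sent
        for cdn, sent in edge_requests.items()
        if sent > 0
    }
```

A lower ratio means the CDN is offloading more from the origin, but compare it alongside the latency measurements from step 4 rather than on its own.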

With these facts in hand, your decision will be easy. For what it's worth, I actually think you're right: Akamai might just have too many nodes for "smaller" sites.

Note that some of the smaller CDNs automatically use a tiered model for everything; they're just architected that way from the start. In such a model, a request to a particular edge node is routed to an "interior" node on a cache miss. The "interior" node then routes to the origin if it too has a cache miss. My own testing indicated MaxCDN worked this way: the first request for a file on the Amsterdam node did not always result in a corresponding origin fetch, so it must have gotten the file from somewhere else inside the MaxCDN network.
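
A minimal sketch of that tiered lookup, to show why an edge miss need not reach the origin. The class names and the London node are made up for illustration; this is not how MaxCDN or any other CDN is actually implemented, and it ignores TTLs and eviction entirely.

```python
class Origin:
    """Stands in for your own web server; counts how often it is asked."""

    def __init__(self, files):
        self.files = dict(files)
        self.fetches = 0

    def get(self, path):
        self.fetches += 1
        return self.files[path]


class CacheNode:
    """An edge or interior node: serve from cache, else ask upstream."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.store = {}

    def get(self, path):
        if path not in self.store:
            self.store[path] = self.upstream.get(path)
        return self.store[path]


origin = Origin({"/img/a.jpg": b"jpeg-bytes"})
interior = CacheNode(origin)      # shared mid-tier cache
amsterdam = CacheNode(interior)   # edge nodes sit in front of the interior
london = CacheNode(interior)      # hypothetical second edge

amsterdam.get("/img/a.jpg")       # misses edge and interior, hits origin
london.get("/img/a.jpg")          # misses edge, served by the interior
print(origin.fetches)             # prints 1
```

The second edge's first request is a miss at the edge but is absorbed by the interior node, which matches the behavior observed on the Amsterdam node.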
