CDN and DNS – Why Different IPs Are Delivered Based on Location

Tags: cdn, domain-name-system, http, nameserver

I found this explanation of how a CDN works, but there is one thing I don't really understand. Let's assume I set up multiple DNS servers at my location, and they use the nameserver hostnames dns1.example.com, dns2.example.com and dns3.example.com. These DNS servers are able to return a server IP depending on the visitor's location (ping, geo database, browser language or whatever). Now I enter these nameservers for my domain www.example.org at the registry.

Now the very first request for www.example.org with an expired TTL tries to resolve the domain. It asks:

  1. the local hosts file/DNS cache; if the TTL has expired:
  2. the internet provider's DNS; if the TTL has expired:
  3. the root DNS; if the TTL has expired:
  4. my own dns1.example.com

But if I understand it correctly, the new IP is then added to all of these nameserver caches until the TTL expires again. So how is it possible to send different IPs to visitors depending on their location?

In this answer theandym said the request is "forwarded", but I don't think this is how a CDN works, because "forwarding" lengthens the transmission path, resulting in a longer loading time. Or does a CDN require a zero TTL for the domain?

Update1
Through this question I found Google's document describing how they optimized CDN performance. It does not explain how a CDN works in general, but it contains interesting explanations like the following:

Thereafter, whenever a client attempts to fetch content hosted on the
CDN, the client is redirected to the node determined to have the least
latency to its prefix. This redirection however is based on the prefix
corresponding to the IP address of the DNS nameserver that resolves
the URL of the content on the client’s behalf, which is typically
co-located with the client.

This means Google first measures the latency for all IP prefixes and builds a DNS resolution table (?) covering all available prefixes. If a visitor has the IP 198.51.100.231, the Google server IP assigned to the prefix 198.51.100.0 is used. But again: how does Google's DNS know which IP the visitor is using? Most visitors resolve Google's domain through their internet provider, so the resolution is done by those external DNS servers, isn't it?
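To make the "resolution table" idea concrete, here is a minimal sketch of a prefix lookup. Note that, according to the quoted passage, the address being looked up is the one of the resolving nameserver, not the visitor's own. The table contents, the node names and the longest-prefix-match rule are assumptions for illustration, not something taken from Google's document.

```python
import ipaddress

# Hypothetical table: resolver prefix -> CDN node with the lowest measured latency.
PREFIX_TO_NODE = {
    ipaddress.ip_network("198.51.100.0/24"): "node-a",
    ipaddress.ip_network("198.51.0.0/16"): "node-b",
    ipaddress.ip_network("0.0.0.0/0"): "node-default",  # catch-all entry
}


def node_for_resolver(resolver_ip):
    """Return the node for the most specific prefix containing resolver_ip."""
    addr = ipaddress.ip_address(resolver_ip)
    matches = [net for net in PREFIX_TO_NODE if addr in net]
    # Assumption: the longest (most specific) matching prefix wins.
    best = max(matches, key=lambda net: net.prefixlen)
    return PREFIX_TO_NODE[best]


print(node_for_resolver("198.51.100.231"))
```

With this table the address 198.51.100.231 falls into 198.51.100.0/24, so the sketch picks the node stored for that prefix.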

As an additional example: if I resolve the domain facebook.com with different online tools (hosted in different countries), it resolves to different IPs with different reverse names, such as:

  • 31.13.92.36 Reverse: edge-star-mini-shv-01-frt3.facebook.com
  • 31.13.76.68 Reverse: edge-star-mini-shv-01-sea1.facebook.com
  • 31.13.69.228 Reverse: edge-star-mini-shv-01-iad3.facebook.com
  • 157.240.2.35 Reverse: edge-star-mini-shv-01-ort2.facebook.com

After that I thought it could depend on the location of the DNS server used by the visitor, but I tried my own (Deutsche Telekom, Germany), Google's (8.8.8.8) and a major one from France (Orange), and they all returned the IP 31.13.92.36 for facebook.com.

Best Answer

OK, it seems I can now give a rough answer to my own question. Anurag Bhatia says that there are two methods by which a CDN can work:

DNS

Have DNS to do the magic i.e when users from network ISP A lookup for cdn.website.com, they should get a unicast IP address of Cache A in return, similarly for users coming from ISP B network, Cache B’s unicast IP should return.

Let's say we have a server with the IP 1.2.3.4 located in the USA and a cache server with the IP 2.3.4.5 located in Germany. Now a visitor tries to resolve the domain example.org. If he has not changed his network settings, he uses the DNS server of his internet service provider (ISP). That DNS server now asks dns1.example.com (the nameserver of the domain) for the IP. The answer depends on the location of the ISP's DNS server: if it is located in Germany, dns1.example.com returns 2.3.4.5, and if it is located in the USA, it returns 1.2.3.4. Each ISP resolver caches only the answer it received itself, so caching does not spread one location's IP to visitors in other locations.
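A minimal sketch of this behaviour, assuming the authoritative server decides purely by the country of the asking resolver. The two server IPs come from the example above; the resolver names, the country mapping, the TTL value and the fallback to the US server are made up for illustration.

```python
import time

# Server IPs from the example above; everything else here is hypothetical.
SERVER_BY_COUNTRY = {"US": "1.2.3.4", "DE": "2.3.4.5"}
RESOLVER_COUNTRY = {"isp-resolver-de": "DE", "isp-resolver-us": "US"}
TTL_SECONDS = 300


def authoritative_answer(resolver_name):
    """What dns1.example.com would return to the resolver that is asking."""
    country = RESOLVER_COUNTRY.get(resolver_name, "US")  # assumed fallback
    return SERVER_BY_COUNTRY[country]


class Resolver:
    """An ISP resolver with its own private cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # domain -> (ip, expiry timestamp)

    def resolve(self, domain, now=None):
        now = time.time() if now is None else now
        cached = self.cache.get(domain)
        if cached and cached[1] > now:
            return cached[0]  # still fresh, no upstream query
        ip = authoritative_answer(self.name)
        self.cache[domain] = (ip, now + TTL_SECONDS)
        return ip


german = Resolver("isp-resolver-de")
american = Resolver("isp-resolver-us")
print(german.resolve("example.org"))
print(american.resolve("example.org"))
```

Both resolvers cache an answer for the same domain, but each one holds the IP that was chosen for its own location, so no zero TTL is needed.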

But this method has a disadvantage: whenever a user has changed his network settings and uses a DNS provider that does not support the EDNS0 client subnet option (see IETF draft), for example a corporation's central DNS server, dns1.example.com will again answer with the IP nearest to that DNS server's location. This time, however, the visitor is most likely in a different location, which causes higher latency.

DNS providers that support the EDNS0 client subnet option pass information about the user's network to the authoritative DNS server, so the authoritative DNS server can respond with the IP nearest to the user's location:

Today, if you’re using OpenDNS or Google Public DNS and visiting a website or using a service provided by one of the participating networks or CDNs in the Global Internet Speedup then a truncated version of your IP address will be added into the DNS request. The Internet service or CDN will use this truncated IP address to make a more informed decision in how it responds so that you can be connected to the most optimal server.

...

; EDNS: version: 0, flags:; udp: 512
; CLIENT-SUBNET: 130.89.89.0/24/21
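The "truncated version of your IP address" can be sketched as follows. I am assuming the usual reading of the dig-style output above: the first number after the address is the prefix length sent by the resolver, and the second is the scope the authoritative server says its answer applies to. The host address 130.89.89.123 is a made-up example.

```python
import ipaddress


def client_subnet(client_ip, source_prefix_len=24):
    """Truncate a client IP to the subnet a resolver would forward."""
    # strict=False zeroes the host bits instead of raising an error.
    net = ipaddress.ip_network(f"{client_ip}/{source_prefix_len}", strict=False)
    return str(net)


print(client_subnet("130.89.89.123"))  # 130.89.89.0/24
```

Only the network part leaves the resolver, so the authoritative server learns roughly where the user is without seeing the full address.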

Anycast

Have routing to route to nearest cache node based on “anycast routing” concept. Here Cache A, Cache B and Cache C will use same identical IP address and routing will take care of reaching the closest one.

I don't really understand anycast because of BGP etc., but I think Anurag Bhatia's further explanation gives an idea of how it could work:

  1. Optimization is based on BGP routing and announcement with little role of DNS.
  2. This setup is very hard to build up and scale since for anycast to work perfectly at global level, one needs lot’s and lot’s of peering and consistent transit providers at each location. If any of peers leaks a route to upstream or other peers, there can be lot of unexpected traffic on a given cluster due to break of anycast.
  3. This setup has no dependency on DNS recursor and hence Google DNS or OpenDNS works just fine.
  4. This saves a significant amount of IP addresses since same pools are used at multiple locations.

Anycast also has a disadvantage: routing is flexible. While the target node may be located in network A at the start of a TCP session, it may change to network B during the session, and the new node knows nothing about that session. Therefore anycast is used in practice mainly for UDP, which is a session-less protocol.
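A toy sketch of the idea, under the simplifying assumption that the network always prefers the announcement with the shortest path (real BGP route selection uses more criteria). The anycast address and the path lengths are hypothetical.

```python
ANYCAST_IP = "192.0.2.1"  # documentation address standing in for the shared IP


def nearest_node(path_lengths):
    """Pick the cache whose route announcement has the shortest path."""
    return min(path_lengths, key=path_lengths.get)


# Hypothetical path lengths as seen from one client's network.
before = {"Cache A": 2, "Cache B": 4, "Cache C": 5}
after = {"Cache A": 6, "Cache B": 4, "Cache C": 5}  # route to Cache A got longer

print(ANYCAST_IP, "->", nearest_node(before))
print(ANYCAST_IP, "->", nearest_node(after))
```

The client keeps talking to the same IP, yet after the route change its packets arrive at a different cache, which is exactly what would disturb a running TCP session.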

Most CDNs use DNS for HTTP/HTTPS traffic and anycast for DNS requests.
