How to set up AWS CloudFront for a dynamic number of domains for a dynamic site


We have a webs/wix/etc-like website management system we are trying to use with CloudFront. It has the following domains and subdomains.
– ourdomain: our main website
– admin.ourdomain: the administration interface for every website, available through https
– images.ourdomain: an S3 bucket
– router.ourdomain: see below
– [customersomething].ourdomain: the subdomains for our free users
– []: domains for our premium customers

The system works like this: most of the domains are CNAME-d to router.ourdomain (because that was the easiest way for our customers, with their different domain registrars and such), and router.ourdomain is A-aliased to our ELB. The PHP on our EC2 instances then handles the sites based on the HTTP_HOST value, while the images come from S3.

Now we want to put all of this behind CloudFront. Most of it is trivial. It was easy to put the S3 bucket behind CF. It was easy to put every ourdomain/*.(js|css) behind CF through an asset.ourdomain subdomain. It is a joy that we can use *.images.ourdomain to shard between subdomains on the fly, decreasing client-side load time, and so on. We could even put everything under *.ourdomain, including the dynamic PHP free sites, behind CF by putting "*.ourdomain" into the CF params.

But the one thing we can't figure out is how to put all the dynamically created PHP sites with custom domains behind CF. As a reminder: these are CNAME-d to router.ourdomain. Putting each domain into the CF params is not an option, as we need to be able to handle tens of thousands of domain names, and we need to do that without manual configuration for each of them.

So our thought was that we should put router.ourdomain into the CF configuration as an alternate domain name, point router.ourdomain to the CF distribution in Route 53, and point CF to our ELB as the origin. That turned out to be a nice way to get this message every time: "ERROR The request could not be satisfied. Bad request. Generated by cloudfront (CloudFront)". Actually not every time: www.ourdomain works as it should (it is CNAME-d to router.ourdomain), but every other subdomain gives the error above. (*.ourdomain is CNAME-d to router.ourdomain, but the same happens even for subdomains that are CNAME-d to router.ourdomain one by one, except for www and, of course, router.) So right now we not only have no idea how to solve this, we don't even understand why it works for www if it doesn't work for every other subdomain, or vice versa.

Any thoughts and ideas would be appreciated, thanks.

Best Answer

As a reminder: these are CNAME-d to router.ourdomain

Yes, you mentioned that. Here's the thing about that:

It doesn't matter.

Yes, CloudFront refers to alternate hostnames as CNAMEs. Yes, a CNAME DNS record is the typical way you would route a given site's traffic to hit CloudFront. But no, having configured a hostname as a CNAME pointing to your CloudFront distribution is not relevant to how CloudFront, or HTTP in general, works.

When a browser wants to connect to a web server, it looks up the IP address from the DNS. If there's a CNAME in the path, that information is discarded. All the browser cares about is "what IP address do I connect to?"

Let's say was a CNAME to The browser looks up and ends up with the IP address for

With the answer in hand, the browser establishes a connection to the web server and sends a request.

GET / HTTP/1.1

The Host: header in the HTTP request sent by the browser contains the hostname as shown in the address bar. The CNAME info is completely unavailable and unknown to either the browser or the web server.

So how would the CNAME be relevant to request parsing and layer 7 routing? It can't be. The CNAME target is only used as a path to find the IP address to connect with.
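To make this concrete, here is a minimal Python sketch of how a client derives the request from the URL alone. The hostnames `www.customer.example` and `router.ourdomain.example` and the helper `build_request` are hypothetical, chosen only for illustration; the point is that the CNAME target never enters the request.

```python
from urllib.parse import urlsplit


def build_request(url: str) -> str:
    """Return the HTTP/1.1 request text a client would send for `url`."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # The Host header is taken from the URL the user typed,
    # never from anything learned during DNS resolution.
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


# Hypothetical DNS data: the customer's name is a CNAME to the router name.
# It is only used to find an IP address; build_request never sees it.
dns_cname = {"www.customer.example": "router.ourdomain.example"}

request = build_request("http://www.customer.example/")
print(request)
```

The printed request carries the customer's hostname in Host:, and the router name appears nowhere in it.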

Your distribution is not the only one using those IP addresses at CloudFront. There are hundreds or thousands of others. It comes down to the Host: header.

The list of "CNAMEs" (alternate hostnames) is a set of hostnames for CloudFront to match, in the incoming Host: header, to determine that a request should be considered as belonging to your distribution. It's a list of Host: headers a browser may send. Until the Host: header is matched up with a configured value in a distribution, the distribution associated with that request is unknown and undefined.

What does CloudFront do if it can't match an incoming Host: header with any distribution?

HTTP/1.1 400 Bad Request

So, the behavior you see is expected, and correct. Wildcards do work, in the CloudFront configuration, as long as they make up the sole leftmost element in the alternate hostname configuration line. But this has nothing to do with the DNS.
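The matching described above can be modelled roughly as follows. This is a simplified sketch, not CloudFront's actual implementation: the distribution ID `DIST_A`, the `.example` hostnames, and both function names are assumptions, and the wildcard is treated as a plain suffix match on the leftmost element.

```python
from typing import Dict, List, Optional


def match_distribution(host: str, distributions: Dict[str, List[str]]) -> Optional[str]:
    """Return the ID of the distribution whose alias list matches `host`, if any."""
    host = host.lower().rstrip(".")
    # Exact alternate hostnames are checked first.
    for dist_id, aliases in distributions.items():
        if host in aliases:
            return dist_id
    # Then wildcard aliases, where "*" is the sole leftmost element.
    for dist_id, aliases in distributions.items():
        for alias in aliases:
            if alias.startswith("*.") and host.endswith(alias[1:]):
                return dist_id
    return None


def respond(host: str, distributions: Dict[str, List[str]]) -> str:
    """Model the edge's decision: route the request or reject it."""
    dist_id = match_distribution(host, distributions)
    return "400 Bad Request" if dist_id is None else f"routed to {dist_id}"


# Hypothetical setup with only the router name configured as an alias.
only_router = {"DIST_A": ["router.ourdomain.example"]}
# The same distribution with a wildcard alias added.
with_wildcard = {"DIST_A": ["router.ourdomain.example", "*.ourdomain.example"]}

print(respond("shop.customer.example", only_router))
print(respond("free1.ourdomain.example", with_wildcard))
```

In this model, a customer domain that is merely CNAME-d to the router name is rejected, because the lookup only ever sees the Host: header.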

You cannot avoid configuring other, customer-owned domain names in CloudFront if you want them to ride your distribution.

However, you can modify them programmatically through the API, instead of manually.
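As one possible sketch of doing this through the API, the following uses the boto3 SDK's CloudFront client (`get_distribution_config` and `update_distribution`). The helper names, the distribution ID, and the hostnames are hypothetical, and the sketch assumes AWS credentials are already configured; treat it as a starting point rather than a finished tool.

```python
import copy
from typing import List


def with_added_aliases(distribution_config: dict, new_hosts: List[str]) -> dict:
    """Return a copy of a DistributionConfig with `new_hosts` merged into Aliases."""
    config = copy.deepcopy(distribution_config)
    merged = list(config.get("Aliases", {}).get("Items") or [])
    for host in new_hosts:
        host = host.lower()
        if host not in merged:
            merged.append(host)
    # The API expects Quantity to agree with the number of Items.
    config["Aliases"] = {"Quantity": len(merged), "Items": merged}
    return config


def add_aliases(distribution_id: str, new_hosts: List[str]) -> None:
    """Fetch the current config, add the hostnames, and push the update."""
    import boto3  # third-party AWS SDK, imported here so the helper above runs without it

    client = boto3.client("cloudfront")
    current = client.get_distribution_config(Id=distribution_id)
    updated = with_added_aliases(current["DistributionConfig"], new_hosts)
    # IfMatch must carry the ETag returned with the config that was just read.
    client.update_distribution(
        Id=distribution_id,
        DistributionConfig=updated,
        IfMatch=current["ETag"],
    )
```

A call such as `add_aliases("EXXXXXXXXXXXXX", ["shop.customer.example"])` (hypothetical ID and hostname) could then be triggered by whatever part of the system registers a new premium domain.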

There is a limit of 100 such aliases per distribution, but you can request an increase by submitting a form to AWS support. In truth, though, you'd be just as well served to break these up into multiple distributions, because CloudFront won't cache the object at a given path with one host in the request headers and return it as a cached response for a different host, even in the same distribution, if you are forwarding the Host header to the origin server (which you have to do, if the origin server is to be able to tell the difference). It can't, because a significant parameter in the request has changed from one request to the other.
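The caching point can be illustrated with a toy model. This is not CloudFront's cache, just a dictionary keyed the way the paragraph above describes; `cache_key`, `fetch`, the `origin` callable, and the hostnames are all assumptions made for the example.

```python
from typing import Callable, Dict, Tuple


def cache_key(host: str, path: str, forward_host: bool) -> tuple:
    """When Host is forwarded to the origin, it becomes part of the key."""
    return (host.lower(), path) if forward_host else (path,)


def fetch(
    cache: Dict[tuple, str],
    host: str,
    path: str,
    forward_host: bool,
    origin: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Return (body, "hit" or "miss") for a request against the toy cache."""
    key = cache_key(host, path, forward_host)
    if key in cache:
        return cache[key], "hit"
    body = origin(host, path)
    cache[key] = body
    return body, "miss"


def origin(host: str, path: str) -> str:
    # Stand-in for the PHP application, which renders per-host content.
    return f"page for {host}{path}"


cache: Dict[tuple, str] = {}
first = fetch(cache, "a.customer.example", "/", True, origin)
second = fetch(cache, "b.customer.example", "/", True, origin)
third = fetch(cache, "a.customer.example", "/", True, origin)
print(first[1], second[1], third[1])
```

Because two hosts never share an entry in this model, putting them in the same distribution gains nothing in cache reuse compared with separate distributions.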
