I've hit a few pages via Google that explain how to set the header in S3 for individual objects. That's really not a productive way to do it, especially since in my case we are talking about several objects.
Well, "productive" or not, that is how it actually is designed to work.
CloudFront does not add Cache-Control: headers.
CloudFront passes through (and also respects, unless otherwise configured) the Cache-Control: headers provided by the origin server, which in this case is S3.
To get Cache-Control: headers provided by S3 when an object is fetched, they must be provided when the object is uploaded into S3, or added to the object's metadata by a subsequent put+copy operation, which can be used to internally copy an object onto itself in S3, modifying the metadata in the process. This is what the console does, behind the scenes, when you edit object metadata.
There is also (in case you are wondering) no global setting in S3 to force all objects in a bucket to return these headers -- it's a per-object attribute.
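For example, here is a minimal sketch of the put+copy approach using the AWS SDK for JavaScript (v2); the bucket and key names are hypothetical. Note that MetadataDirective: 'REPLACE' discards all of the object's existing metadata, so anything you want to keep (such as Content-Type) must be specified again:

// Copy the object onto itself, replacing its metadata in the process.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

s3.copyObject({
    Bucket: 'my-bucket',                     // hypothetical bucket name
    Key: 'path/to/object',                   // hypothetical key
    CopySource: '/my-bucket/path/to/object', // source and target are the same object
    MetadataDirective: 'REPLACE',            // replace, rather than copy, the metadata
    CacheControl: 'public, max-age=86400',
    ContentType: 'text/html'                 // REPLACE also clears Content-Type, so restate it
}, (err) => {
    if (err) throw err;
});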
Update: Lambda@Edge is a new feature in CloudFront that allows you to fire triggers against requests and/or responses, between viewer and cache and/or cache and origin, running code written in Node.js against a simple request/response object structure exposed by CloudFront.
One of the main applications for this feature is manipulating headers... so while the above is still accurate -- CloudFront itself does not add Cache-Control headers -- it is now possible for a Lambda function to add them to the response that CloudFront returns.
This example adds Cache-Control: public, max-age=86400 only if there is no Cache-Control header already present on the response.
Using this code in an Origin Response trigger would cause it to fire every time CloudFront fetches an object from the origin, and modify the response before CloudFront caches it.
'use strict';

exports.handler = (event, context, callback) => {
    const response = event.Records[0].cf.response;

    // Only add Cache-Control if the origin didn't already provide one.
    if (!response.headers['cache-control']) {
        response.headers['cache-control'] = [{
            key: 'Cache-Control',
            value: 'public, max-age=86400'
        }];
    }

    // Return the (possibly modified) response to CloudFront.
    callback(null, response);
};
Update (2018-06-20): I recently submitted a feature request to the CloudFront team to allow static origin response headers to be configured as origin attributes, similar to the way static request headers can already be added -- but with a twist: each header could be configured to be added either conditionally (only if the origin didn't provide that header in the response) or unconditionally (adding the header and overwriting the header from the origin, if present).
With feature requests, you typically don't receive any confirmation of whether they are actually considering implementing the new feature, or even whether they might already be working on it; it's just announced when they are done. So, I have no idea whether this will be implemented. There is an argument to be made that since this capability is already available via Lambda@Edge, there's no need for it in the base functionality... but my counter-argument is that the base functionality is not feature-complete without the ability to do simple, static response header manipulation, and that if this is the only reason a trigger is needed, then requiring Lambda triggers is an unnecessary cost, both financially and in added latency (even though neither cost is necessarily outlandish).
If I understand your question correctly, you want to serve a specific HTML page for all 404 errors, returning a 200 response to the user rather than a 404.
This can easily be configured in CloudFront. Check out this blog post on how to do it; specifically, read the final section on CloudFront.
Basically, there's an area in the CloudFront configuration that lets you do exactly what you're trying to do: map specific errors to files and return whatever response code you want.
In case you're using CloudFormation to define your CloudFront distributions and would like to update your templates to accomplish this, you would add the following under the DistributionConfig property of the resource:
CustomErrorResponses:
  - ErrorCode: 404
    ResponseCode: 200
    ResponsePagePath: /index.html
Best Answer
This is really bordering on "do my system architecture for me," but your four ideas are interesting case studies in variable security, so let's run through your options and see how they fare:
4. Checking referrer
The referrer is provided by the client. Trusting client-provided data for authentication/authorization pretty much voids security (I can just claim to have been sent from where you expect me to come from -- see the one-liner below).
Verdict: TERRIBAD idea - trivial to bypass.
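To illustrate how trivial the bypass is, here's a one-liner sketch using Node 18+'s built-in fetch; both URLs are hypothetical. (Browsers forbid scripts from setting Referer, but any non-browser client, curl included, will happily send whatever it likes.)

// Claim to come from the "allowed" page -- the server can't tell the difference.
fetch('https://downloads.example.com/secret.zip', {
    headers: { Referer: 'https://app.example.com/allowed-page' }
});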
3. Download the files through our server
Not a bad idea, as long as you're willing to spend the bandwidth to make it happen, and your server is reliable.
Going on the assumption that you've already solved the security problem for your normal server/app, this is the most secure of the options you've presented.
Verdict: Good solution. Very secure, but possibly suboptimal if bandwidth is a factor.
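As a minimal sketch of this approach -- assuming (hypothetically) an Express app in Node.js fronting a private S3 bucket; the requireLogin middleware and bucket name are stand-ins for your app's real auth and storage:

const express = require('express');
const AWS = require('aws-sdk');

const app = express();
const s3 = new AWS.S3();

// Placeholder auth check -- substitute your app's real session/login logic.
const requireLogin = (req, res, next) =>
    req.headers['x-session-token'] ? next() : res.sendStatus(403);

// Stream the private object through the server; the bucket itself never
// needs to be publicly readable.
app.get('/download/:key', requireLogin, (req, res) => {
    s3.getObject({ Bucket: 'my-private-bucket', Key: req.params.key })
        .createReadStream()
        .on('error', () => res.sendStatus(404))
        .pipe(res);
});

app.listen(3000);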
2. Obfuscated URLs
Security Through Obscurity? Really? No.
I'm not even going to analyze it. Just no.
Verdict: If #4 was TERRIBAD, this is TERRIWORSE, because people don't even have to go through the effort of forging a referrer header. Guess the string and win a prize... all the data!
1. Generating (expiring) signed URLs with PHP
This option has a pretty low suck quotient.
Anyone who gets the URL can click on it and snarf the data, which is a security no-no, but you mitigate this by making the link expire (as long as the link lifetime is short enough, the vulnerability window is small).
The URL expiring may inconvenience some users who want to hang on to the download link for a long time, or who don't get the link in a timely manner -- that's a bit of a User Experience suck, but it may be worth it.
Verdict: Not as good as #3, but if bandwidth is a major concern it's certainly better than #4 or #2.
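For what it's worth, here's a minimal sketch of generating such a URL. The question says PHP, but the equivalent pre-signed-URL call exists in every AWS SDK; this is the JavaScript version, to match the rest of this post, with hypothetical bucket and key names:

// Generate a pre-signed S3 URL that expires after 5 minutes.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

const url = s3.getSignedUrl('getObject', {
    Bucket: 'my-private-bucket',
    Key: 'downloads/report.pdf',
    Expires: 300 // seconds; after this, S3 rejects the URL
});

// Hand `url` only to the authenticated user.
console.log(url);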
What would I do?
Given these options, I would go with #3 -- pass the files through your own front-end server and authenticate the way your app normally does. Assuming your normal security is pretty decent, this is the best option from a security standpoint.
Yes, this means more bandwidth use on your server, and more resources playing middleman -- but you can always just charge a tiny bit more for that.