I've hit a few pages via Google that explain how to set the header in S3 for individual objects. That's really not a productive way to do it, especially since in my case we are talking about several objects.
Well, "productive" or not, that is how it actually is designed to work.
CloudFront does not add Cache-Control: headers.
CloudFront passes through (and also respects, unless otherwise configured) the Cache-Control: headers provided by the origin server, which in this case is S3.
To get Cache-Control: headers provided by S3 when an object is fetched, they must be provided when the object is uploaded into S3, or added to the object's metadata by a subsequent put+copy operation, which can be used to internally copy an object onto itself in S3, modifying the metadata in the process. This is what the console does, behind the scenes, when you edit object metadata.
There is also (in case you are wondering) no global setting in S3 to force all objects in a bucket to return these headers -- it's a per-object attribute.
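For example, here is a minimal sketch of the put+copy approach using the AWS SDK for JavaScript (v2); the bucket and key names are hypothetical. Note that MetadataDirective: 'REPLACE' discards all of the object's existing metadata, so anything you want to keep (such as Content-Type) must be specified again:

// Copy the object onto itself, replacing its metadata in the process.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

s3.copyObject({
    Bucket: 'my-bucket',                     // hypothetical bucket name
    Key: 'path/to/object',                   // hypothetical key
    CopySource: '/my-bucket/path/to/object', // source and target are the same object
    MetadataDirective: 'REPLACE',            // replace, rather than copy, the metadata
    CacheControl: 'public, max-age=86400',
    ContentType: 'text/html'                 // REPLACE also clears Content-Type, so restate it
}, (err) => {
    if (err) throw err;
});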
Update: Lambda@Edge is a new feature in CloudFront that allows you to fire triggers against requests and/or responses, between viewer and cache and/or cache and origin, running code written in Node.js against a simple request/response object structure exposed by CloudFront.
One of the main applications for this feature is manipulating headers... so while the above is still accurate -- CloudFront itself does not add Cache-Control headers -- it is now possible for a Lambda function to add them to the response that CloudFront returns.
This example adds Cache-Control: public, max-age=86400 only if there is no Cache-Control header already present on the response.
Using this code in an Origin Response trigger would cause it to fire every time CloudFront fetches an object from the origin, and modify the response before CloudFront caches it.
'use strict';

exports.handler = (event, context, callback) => {
    const response = event.Records[0].cf.response;

    // Only add Cache-Control if the origin didn't already provide one.
    if (!response.headers['cache-control']) {
        response.headers['cache-control'] = [{
            key: 'Cache-Control',
            value: 'public, max-age=86400'
        }];
    }

    // Return the (possibly modified) response to CloudFront.
    callback(null, response);
};
Update (2018-06-20): I recently submitted a feature request to the CloudFront team to allow static origin response headers to be configured as origin attributes, similar to the way static request headers can already be added -- but with a twist: each header could be configured to be added either conditionally (only if the origin didn't provide that header in the response) or unconditionally (adding the header and overwriting the header from the origin, if present).
With feature requests, you typically don't receive any confirmation of whether they are actually considering implementing the new feature, or even whether they might already be working on it; it's just announced when they are done. So, I have no idea whether this will be implemented. There is an argument to be made that since this capability is already available via Lambda@Edge, there's no need for it in the base functionality... but my counter-argument is that the base functionality is not feature-complete without the ability to do simple, static response header manipulation, and that if this is the only reason a trigger is needed, then requiring Lambda triggers is an unnecessary cost, both financially and in added latency (even though neither cost is necessarily outlandish).
If I understand your question correctly, you want to serve a specific HTML page for all 404 errors, returning a 200 response to the user rather than a 404.
This can easily be configured in CloudFront. Check out this blog post on how to do it; specifically, read the final section on CloudFront.
Basically, there's an area in the CloudFront configuration that lets you do exactly what you're trying to do: map specific errors to files and return whatever response code you want.
In case you're using CloudFormation to define your CloudFront distributions and would like to update your templates to accomplish this, you would add the following under the DistributionConfig property of the resource:
CustomErrorResponses:
  - ErrorCode: 404
    ResponseCode: 200
    ResponsePagePath: /index.html
Best Answer
This is really bordering on "do my system architecture for me," but your four ideas are interesting case studies in variable security, so let's run through your options and see how they fare:
4. Checking referrer
The referrer is provided by the client. Trusting client-provided data for authentication/authorization pretty much voids security (I can just claim to have been sent from where you expect me to come from -- see the one-liner below).
Verdict: TERRIBAD idea - trivial to bypass.
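To illustrate how trivial the bypass is, here's a one-liner sketch using Node 18+'s built-in fetch; both URLs are hypothetical. (Browsers forbid scripts from setting Referer, but any non-browser client, curl included, will happily send whatever it likes.)

// Claim to come from the "allowed" page -- the server can't tell the difference.
fetch('https://downloads.example.com/secret.zip', {
    headers: { Referer: 'https://app.example.com/allowed-page' }
});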
3. Download the files through our server
Not a bad idea, as long as you're willing to spend the bandwidth to make it happen, and your server is reliable.
Going on the assumption that you've already solved the security problem for your normal server/app, this is the most secure of the options you've presented.
Verdict: Good solution. Very secure, but possibly suboptimal if bandwidth is a factor.
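As a minimal sketch of this approach -- assuming (hypothetically) an Express app in Node.js fronting a private S3 bucket; the requireLogin middleware and bucket name are stand-ins for your app's real auth and storage:

const express = require('express');
const AWS = require('aws-sdk');

const app = express();
const s3 = new AWS.S3();

// Placeholder auth check -- substitute your app's real session/login logic.
const requireLogin = (req, res, next) =>
    req.headers['x-session-token'] ? next() : res.sendStatus(403);

// Stream the private object through the server; the bucket itself never
// needs to be publicly readable.
app.get('/download/:key', requireLogin, (req, res) => {
    s3.getObject({ Bucket: 'my-private-bucket', Key: req.params.key })
        .createReadStream()
        .on('error', () => res.sendStatus(404))
        .pipe(res);
});

app.listen(3000);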
2. Obfuscated URLs
Security Through Obscurity? Really? No.
I'm not even going to analyze it. Just no.
Verdict: If #4 was TERRIBAD, this is TERRIWORSE, because people don't even have to go through the effort of forging a referrer header. Guess the string and win a prize... all the data!
1. Generating (expiring) signed URLs with PHP
This option has a pretty low suck quotient.
Anyone who gets the URL can click on it and snarf the data, which is a security no-no, but you mitigate this by making the link expire (as long as the link lifetime is short enough, the vulnerability window is small).
The URL expiring may inconvenience some users who want to hang on to the download link for a long time, or who don't get the link in a timely manner -- that's a bit of a User Experience suck, but it may be worth it.
Verdict: Not as good as #3, but if bandwidth is a major concern it's certainly better than #4 or #2.
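For what it's worth, here's a minimal sketch of generating such a URL. The question says PHP, but the equivalent pre-signed-URL call exists in every AWS SDK; this is the JavaScript version, to match the rest of this post, with hypothetical bucket and key names:

// Generate a pre-signed S3 URL that expires after 5 minutes.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

const url = s3.getSignedUrl('getObject', {
    Bucket: 'my-private-bucket',
    Key: 'downloads/report.pdf',
    Expires: 300 // seconds; after this, S3 rejects the URL
});

// Hand `url` only to the authenticated user.
console.log(url);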
What would I do?
Given these options, I would go with #3 -- pass the files through your own front-end server and authenticate the way your app normally does. Assuming your normal security is pretty decent, this is the best option from a security standpoint.
Yes, this means more bandwidth use on your server, and more resources playing middleman -- but you can always just charge a tiny bit more for that.