Is it possible to upload files directly to Amazon S3 and edit them

Tags: amazon-ec2, amazon-s3, amazon-web-services, upload

I'm facing this problem: I wish to let users upload their files directly to my bucket on Amazon S3. This step can be accomplished easily, as described here.
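For reference, one way to do that step is with a pre-signed URL from the AWS SDK for PHP (a minimal sketch using SDK v3; the bucket, key, and region are placeholders):

    <?php
    require 'vendor/autoload.php';

    use Aws\S3\S3Client;

    $s3 = new S3Client(['version' => 'latest', 'region' => 'us-east-1']);

    // Generate a pre-signed PUT URL that the browser can upload to
    // directly, so the file never passes through my web server.
    $cmd = $s3->getCommand('PutObject', [
        'Bucket' => 'my-upload-bucket', // placeholder
        'Key'    => 'uploads/sample.wav',
    ]);
    $request = $s3->createPresignedRequest($cmd, '+15 minutes');
    echo (string) $request->getUri();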

But what if I need to edit each file with FFMPEG?

E.g.

shell_exec('ffmpeg -i /sample.wav -acodec libmp3lame /sample.mp3');

Currently I first store the files on my server, run the shell commands, and then use putObject to send the results to S3. Is there a better way to accomplish this that leaves my server quiet?
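My current flow looks roughly like this (simplified, with placeholder paths and bucket name):

    <?php
    require 'vendor/autoload.php';

    use Aws\S3\S3Client;

    $s3 = new S3Client(['version' => 'latest', 'region' => 'us-east-1']);

    // 1. The file is already on my server after a normal form upload.
    $input  = '/tmp/sample.wav';
    $output = '/tmp/sample.mp3';

    // 2. Transcode locally with ffmpeg.
    shell_exec(sprintf('ffmpeg -i %s -acodec libmp3lame %s',
        escapeshellarg($input), escapeshellarg($output)));

    // 3. Push the result to S3.
    $s3->putObject([
        'Bucket'     => 'my-upload-bucket', // placeholder
        'Key'        => 'transcoded/sample.mp3',
        'SourceFile' => $output,
    ]);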

My specs:

  • EC2 instance running Ubuntu 14.04 with a LAMP stack
  • AWS SDK PHP

Best Answer

My understanding of your setup is that you wish to use the same Ubuntu server you already have to also perform the transcoding of the uploaded media files.

The problem is that objects stored in S3 are not accessible like files on a normal file system. You could download each file from S3 and process it on the web server instance, but this would be a complicated and potentially inefficient setup.

It would be better to decouple the transcoding process from your Ubuntu server and let it happen in true cloud style. The best fit for this is S3 event notifications combined with Lambda.

So, based on my understanding of your use case, here is what I would recommend:

  1. Create a bucket with the appropriate permissions for receiving files (best not to make it public, or your bill could get quite expensive)
  2. Create an S3 event to trigger Lambda when an object is created / put / updated, depending on what you wish to trigger on and how files are placed in S3 (see the sketch after this list)
  3. Use Lambda to process the file itself, or have Lambda start an instance whose UserData script does the work.
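As a rough sketch, steps 1 and 2 can be wired up with the AWS SDK for PHP like so. The bucket name and function ARN are placeholders, and it assumes the Lambda function already grants s3.amazonaws.com permission to invoke it:

    <?php
    require 'vendor/autoload.php';

    use Aws\S3\S3Client;

    $s3 = new S3Client(['version' => 'latest', 'region' => 'us-east-1']);

    // Invoke a Lambda function whenever an object is created in the bucket.
    $s3->putBucketNotificationConfiguration([
        'Bucket' => 'my-upload-bucket',
        'NotificationConfiguration' => [
            'LambdaFunctionConfigurations' => [
                [
                    'LambdaFunctionArn' => 'arn:aws:lambda:us-east-1:123456789012:function:transcode-trigger',
                    'Events'            => ['s3:ObjectCreated:*'],
                ],
            ],
        ],
    ]);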

Regarding the instance's UserData, here are the steps you may wish to have it undertake (sketched in code after the list):

  1. download ffmpeg
  2. install ffmpeg
  3. download the file from S3 (Lambda can pass the object key as an argument)
  4. process the file with ffmpeg
  5. upload the result to S3
  6. use the EC2 metadata service to look up the instance's own instance ID
  7. run the AWS CLI command for terminating an instance, supplying the resolved instance ID.
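Here is a sketch of how Lambda (or any SDK caller) could launch such a worker, with the UserData script covering the steps above. The AMI ID, bucket, keys, region, and instance profile are all placeholders; the profile needs S3 access plus ec2:TerminateInstances, and in practice the object key would come from the S3 event rather than being hard-coded:

    <?php
    require 'vendor/autoload.php';

    use Aws\Ec2\Ec2Client;

    $ec2 = new Ec2Client(['version' => 'latest', 'region' => 'us-east-1']);

    // Boot-time script covering steps 1-7 (on Ubuntu 14.04 the ffmpeg
    // package may need a PPA; shown simplified here).
    $userData = <<<'EOT'
    #!/bin/bash
    apt-get update && apt-get install -y ffmpeg awscli                    # steps 1-2
    aws s3 cp s3://my-upload-bucket/uploads/sample.wav /tmp/in.wav        # step 3
    ffmpeg -i /tmp/in.wav -acodec libmp3lame /tmp/out.mp3                 # step 4
    aws s3 cp /tmp/out.mp3 s3://my-upload-bucket/transcoded/sample.mp3    # step 5
    ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)     # step 6
    aws ec2 terminate-instances --instance-ids "$ID" --region us-east-1   # step 7
    EOT;

    $ec2->runInstances([
        'ImageId'            => 'ami-xxxxxxxx',                 // placeholder AMI
        'InstanceType'       => 't2.micro',
        'MinCount'           => 1,
        'MaxCount'           => 1,
        'UserData'           => base64_encode($userData),       // EC2 expects base64
        'IamInstanceProfile' => ['Name' => 'transcode-worker'], // placeholder
    ]);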

You could be better off creating an AMI that already has ffmpeg installed, but you would need to decide whether it's cheaper to spend five minutes prepping each instance or to pay for an AMI to always be on hand. I would say that unless your processing takes longer than an hour, or your use case needs the file returned ASAP, you are better off installing ffmpeg each time, as AWS bills for full hours even if you only use 15 minutes.

Recommendations for this approach:

Chances are you will wish to undertake further activities when the newly processed file is created, so why not make use of S3 events to fire off another Lambda function? :)

Also, to help keep things clean, if your solution allows for it, try to upload the files you create under a different key prefix from the one where the uploaded files land.

Alternate Option: Use Elastic Transcoder

An alternate option is to make use of the AWS Elastic Transcoder service. You would send jobs to the transcoder in the same way, by triggering Lambda when the S3 bucket is updated and having it submit the files in that bucket for transcoding. Elastic Transcoder can then notify an SNS topic, which can trigger an email or another Lambda function to deal with the created file.

Elastic Transcoder would be a cooler and likely better approach but it will require a bit more work.

Elastic Transcoder requires you to create a pipeline, and then create a job using that pipeline. The JavaScript SDK documentation for each is linked below.

CreatePipeline http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/ElasticTranscoder.html#createPipeline-property

CreateJob http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/ElasticTranscoder.html#createJob-property
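The same operations exist in the AWS SDK for PHP, in case you'd rather stay in PHP. A rough sketch, where the role ARN, bucket names, and MP3 preset ID are placeholders (look up the real system preset ID in the Elastic Transcoder console):

    <?php
    require 'vendor/autoload.php';

    use Aws\ElasticTranscoder\ElasticTranscoderClient;

    $et = new ElasticTranscoderClient(['version' => 'latest', 'region' => 'us-east-1']);

    // One-off setup: a pipeline tying the input bucket to an output bucket.
    $pipeline = $et->createPipeline([
        'Name'         => 'wav-to-mp3',
        'InputBucket'  => 'my-upload-bucket',
        'OutputBucket' => 'my-output-bucket',
        'Role'         => 'arn:aws:iam::123456789012:role/Elastic_Transcoder_Default_Role',
    ]);

    // Per file: submit a job against the pipeline.
    $et->createJob([
        'PipelineId' => $pipeline['Pipeline']['Id'],
        'Input'      => ['Key' => 'uploads/sample.wav'],
        'Outputs'    => [
            ['Key' => 'sample.mp3', 'PresetId' => '1351620000001-XXXXXX'], // placeholder preset
        ],
    ]);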

An Alternate Alternate Solution: Use Amazon SQS

If you want the processing to occur on the Ubuntu instance you already have, rather than spinning up another instance from Lambda, you could use S3 events to trigger Lambda to publish a job to Amazon SQS.

An agent of your own creation can then poll Amazon SQS for jobs, each of which references a file in S3 that needs transcoding (see the agent sketch below). This is rather complicated, though; I only include it for completeness, in case you really do need to perform this work on the Ubuntu instance you already have.
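A minimal sketch of such a polling agent in PHP (the queue URL is a placeholder, and the message body is assumed to carry just the S3 key of the file to transcode):

    <?php
    require 'vendor/autoload.php';

    use Aws\Sqs\SqsClient;

    $sqs = new SqsClient(['version' => 'latest', 'region' => 'us-east-1']);
    $queueUrl = 'https://sqs.us-east-1.amazonaws.com/123456789012/transcode-jobs';

    while (true) {
        // Long-poll for jobs published by the Lambda function.
        $result = $sqs->receiveMessage([
            'QueueUrl'            => $queueUrl,
            'MaxNumberOfMessages' => 1,
            'WaitTimeSeconds'     => 20,
        ]);
        $messages = $result['Messages'] ?: [];
        foreach ($messages as $message) {
            $key = $message['Body'];
            // ... download the object from S3, run ffmpeg, upload the result ...
            $sqs->deleteMessage([
                'QueueUrl'      => $queueUrl,
                'ReceiptHandle' => $message['ReceiptHandle'],
            ]);
        }
    }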
