Amazon EFS as code repository for auto-scaled EC2s

amazon-efs amazon-web-services autoscaling magento rsync

tl;dr: I need to set up a fast automatic sync from EFS to multiple EC2s

I've set up an EC2 Auto-Scaling Group in AWS and I'm looking for the best way to manage code deployments to my instances, with as little service interruption as possible (preferably none) and as little scope for human error as possible (preferably none, hahaha)…

This is for a Magento website. I initially looked at storing all web content in EFS (Elastic File System), and having my EC2s mount it at boot, so there was simply one centralised codebase that they would each have access to. I quickly discovered that this was a Very Bad Idea – serving web content of a site the size of Magento over a network share is basically unworkable, and with the latency on EFS it's even worse than your average NFS share.

What I'm now trying to achieve is to have a centralised codebase in EFS, with close-to-real-time sync from there to a "local" (EBS) directory on each instance.

I tried rsync, using a "pull" approach, having each instance rsync files from EFS to itself. It looked good at first, but each scan seems to get gradually slower (over an hour at last check).
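For reference, the pull approach amounts to roughly the following. This is a minimal Python sketch rather than the actual rsync invocation, and the function name pull_sync and its arguments are hypothetical. It walks the whole source tree and copies any file whose size or modification time differs from the local copy. Every pass has to stat every file on the share, which may be part of why a full scan of a Magento-sized tree over EFS is slow, though I haven't confirmed that is the whole story.

```python
import os
import shutil


def pull_sync(src_root, dst_root):
    """Copy files from src_root to dst_root when size or mtime differ.

    Hypothetical helper for illustration only. Returns the list of
    relative paths that were copied. Like rsync without --delete, it
    never removes files from dst_root.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            # One metadata lookup per source file, even if nothing changed.
            src_stat = os.stat(src)
            try:
                dst_stat = os.stat(dst)
                unchanged = (
                    src_stat.st_size == dst_stat.st_size
                    and int(src_stat.st_mtime) == int(dst_stat.st_mtime)
                )
            except FileNotFoundError:
                unchanged = False
            if not unchanged:
                # copy2 preserves the mtime, so the next pass can skip this file.
                shutil.copy2(src, dst)
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return copied
```

Each instance would call something like pull_sync("/mnt/efs/code", "/var/www/html") on a timer; both paths are placeholders, not my real layout.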

I tried find, similar result.

I've experimented with fileconveyor's symlink_or_copy transporter, but that still seems slow – perhaps because for one reason or another it's failing to use inotify to discover changes and is falling back to polling.

Ultimately the goal is to allow a developer to deploy new and changed files to a single location and for those files to replicate quickly and automatically across all running instances. The developer shouldn't have to know or care how many instances are running – it's likely to vary on an hourly basis.

This answer to a similar question is pretty good, and is the approach I am currently using – one protected EC2 instance gets updated, new AMI gets created, remaining instances are killed and replacements booted up based on the new image. EFS basically becomes redundant.

But the manual intervention required is really a lot more hassle, and more prone to human error, than I can stick with long-term. I don't want to have to create a new AMI and Launch Configuration, and update the Auto Scaling Group to use that new LC, every time I do a deployment.

So… how do I sync quickly and automatically from EFS to multiple EC2s?

If I get fileconveyor working in tandem with inotify will that solve it? Or is that a wild goose chase, does anyone know?

Best Answer

Here's what I'd do:

  • Create a "golden image" AMI that has everything in place as of now. Ideally that would be set up using a combination of CloudFormation and OpsWorks.
  • Set up AWS CodeCommit to store your source code.
  • Set up AWS CodeDeploy to deploy updated source code to your instances. This means you don't have to rebuild the AMI for every source code change; it's a simple deployment. By using the golden image rather than building from scratch, you get the benefit of new instances coming up quickly, with only a small delay to update the code. This is a fairly trivial update, so it could probably be done with EC2 User Data if you want to get it done quickly.
  • If you want to automate building test / pre-prod environments, testing, and production deployments (with optional manual approval), you could look at AWS CodePipeline.
  • You can do blue/green (gradual) deployments using Route 53 / Nginx / HAProxy, or red/black (cut-over) deployments using a variety of methods.
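As a rough illustration of the "small delay to update the code" step in the list above, here is a sketch of what a deployment hook or a User Data script might run once the new code has already been fetched onto the instance. It assumes a layout that is not part of the setup described here: each release is unpacked into its own directory, and the web server's document root is a symlink (called current below) that gets switched to the new release. The function name activate_release and all paths are hypothetical, and none of this is something CodeDeploy does for you automatically.

```python
import os


def activate_release(release_dir, current_link):
    """Point current_link at release_dir by atomically replacing the symlink.

    Hypothetical helper for illustration only. current_link must be either
    missing or an existing symlink, not a real directory.
    """
    tmp_link = current_link + ".tmp"
    # Clear any leftover temporary link from an interrupted earlier run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(release_dir, tmp_link)
    # Renaming over the old symlink is atomic on POSIX filesystems, so
    # readers see either the old release or the new one, never a gap.
    os.replace(tmp_link, current_link)
```

Keeping the previous release directory around also makes a rollback the same operation pointed at the older directory.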

This stuff isn't rocket science to get working, but it may take a bit of time if you're not familiar with it. Once you do, the automation could save a fair amount of time on testing and deployments.
