NAT – How to get AWS VPC EC2 instances to see the AWS APIs

amazon-vpc amazon-web-services nat proxy

We're spinning up infrastructure inside of an AWS VPC via CloudFormation.

We're using auto-scaling groups to bring up VPC-EC2 instances (so, we don't bring up instances directly; ASGs manage that).

Inside a VPC, EC2 instances only have a private IP; they cannot see the outside world without further work.

When these instances spin up, we have some bootstrap tasks that require talking to the various AWS APIs. We also have some ongoing tasks that require AWS API traffic.

How are you tackling this apparent chicken-egg problem?

We've read about:

  • NAT instances – but don't like this so much because it's another layer to our stack.
  • assigning elastic-IPs to each VPC instance that needs to talk – but a) they all do, and b) since we're using ASGs, we don't know which instances to assign EIPs to at provision-time, and c) we'd need to set up something to monitor those ASGs and assign EIPs when instances are terminated and replaced
  • spinning up an instance (actually, a load-balanced pair, probably spanning AZs) to act as an AWS-API proxy for all API traffic

I guess I'm wondering whether there's some kind of back door we can open that gives our VPC EC2 instances access to the AWS API endpoints, and nothing else, with a low-complexity setup that doesn't add another network hop to our infrastructure for serving requests.

Best Answer

You have covered the main ways to get a VPC instance in a private subnet to talk to the outside world. One other option:

  1. Route the private subnet's Internet traffic through a VPN tunnel connected to your office, which can then provide access to the rest of the Internet. This is not ideal, since it requires an always-on VPN tunnel and adds an extra hop through your office.

I would suggest using NAT instances; this is the recommended setup for giving machines inside private subnets Internet access. They are configured per subnet, so your machines do not need any knowledge of the NAT configuration when they launch. Just be sure to use an m1.large or larger instance (rather than an m1.small) to get the higher network throughput.
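Since the stack is built with CloudFormation, the per-subnet configuration comes down to a default route in the private subnet's route table that points at the NAT instance. Below is a rough sketch in Python that generates such a template fragment. The logical IDs `NatInstance` and `PrivateRouteTable`, and the helper name `nat_route_fragment`, are made up for illustration; substitute whatever your template already defines.

```python
import json


def nat_route_fragment(nat_instance_id, private_route_table_id):
    """Build a CloudFormation resource that sends a private subnet's
    default route (0.0.0.0/0) through a NAT instance.

    Both arguments are logical IDs of resources assumed to exist
    elsewhere in the same template (hypothetical names).
    """
    return {
        "PrivateDefaultRoute": {
            "Type": "AWS::EC2::Route",
            "Properties": {
                # Route table associated with the private subnet(s).
                "RouteTableId": {"Ref": private_route_table_id},
                # Everything not destined for the VPC itself.
                "DestinationCidrBlock": "0.0.0.0/0",
                # Forward that traffic to the NAT instance.
                "InstanceId": {"Ref": nat_instance_id},
            },
        }
    }


fragment = nat_route_fragment("NatInstance", "PrivateRouteTable")
print(json.dumps(fragment, indent=2))
```

The NAT instance itself would need to sit in a public subnet with a public or Elastic IP and have its source/destination check disabled. Instances launched by the ASGs into the private subnet then pick up the route automatically, with no per-instance setup.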