AWS Security Group – Troubleshooting Private IP Access Issues for EC2 Instances

amazon-ec2 amazon-vpc amazon-web-services

I have two EC2 instances. They're both in the same VPC, and they both have public IPs assigned to them.

My problem is that I'm required to use the public IPs in my Security Groups to allow them to communicate. If I try to use the private IPs, their connections are being denied.

I will eventually remove their public IPs, and would like to not have to change my security group settings afterwards.

Why wouldn't I be able to use private IPs as the source for two machines both in the same VPC?

Best Answer

tl;dr: when you connect to an instance using its public IP address, you are necessarily using your source machine's public address as the source address, too (or that of its NAT Gateway, if the source instance has no public IP). Your traffic also goes out toward the Internet and back in when you do this, although admittedly not very far out.

Let's take two example instances:

#1 public 203.0.113.1 private 172.31.3.1
#2 public 203.0.113.2 private 172.31.3.2

If instance #1 connects to #2 using instance #2's public IP 203.0.113.2:

  • the IP packets leave instance #1 with source IP 172.31.3.1 and destination address 203.0.113.2.
  • the packets arrive at the VPC route table, which sees that 203.0.113.2 is not an IP address inside the VPC, so the route table sends them to the Internet Gateway.
  • the Internet Gateway sees traffic with a source address of 172.31.3.1, which it knows is actually an EC2 instance with public IP 203.0.113.1, so it rewrites the source IP address from 172.31.3.1 to 203.0.113.1 and sends the traffic out toward the Internet.
  • the source IP is now 203.0.113.1 and the destination IP is 203.0.113.2.
  • an unnamed component sees that 203.0.113.2 belongs to your Internet Gateway, so the traffic is hairpinned back to your Internet Gateway for handling. (This component is possibly within the Internet Gateway itself, but it might sit between the Internet Gateway and a rarely-discussed piece of regional AWS infrastructure called the transit center, of which each region has at least two.)
  • the Internet Gateway knows that 203.0.113.2 belongs to your instance #2 so it translates the destination address to 172.31.3.2; at this point the source address is still, and will remain, 203.0.113.1.
  • instance #2 and its security group see this traffic as coming in from the Internet, with source IP 203.0.113.1, so this is the address that must be allowed by the security group.
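
The address rewriting in the steps above can be sketched as a toy model in Python. This is only an illustration of the translation logic, not how AWS implements it: the Packet class, the deliver function, and the two lookup tables are hypothetical names, and the VPC range 172.31.0.0/16 is an assumption (it is the usual default VPC CIDR, but the question does not state it).

```python
from dataclasses import dataclass, replace
from ipaddress import IPv4Address, ip_address, ip_network

# Assumed VPC range; not stated in the question.
VPC_CIDR = ip_network("172.31.0.0/16")

# Private -> public mappings for the two example instances.
PRIVATE_TO_PUBLIC = {
    ip_address("172.31.3.1"): ip_address("203.0.113.1"),
    ip_address("172.31.3.2"): ip_address("203.0.113.2"),
}
PUBLIC_TO_PRIVATE = {pub: priv for priv, pub in PRIVATE_TO_PUBLIC.items()}


@dataclass(frozen=True)
class Packet:
    src: IPv4Address
    dst: IPv4Address


def deliver(src_private: str, dst: str) -> Packet:
    """Return the packet as the destination's security group would see it."""
    pkt = Packet(ip_address(src_private), ip_address(dst))

    # Destination inside the VPC: routed locally, nothing is rewritten.
    if pkt.dst in VPC_CIDR:
        return pkt

    # Otherwise the route table hands the packet to the Internet Gateway,
    # which rewrites the source to the sender's public IP.
    pkt = replace(pkt, src=PRIVATE_TO_PUBLIC[pkt.src])

    # If the destination is one of our own public IPs, the traffic is
    # hairpinned back and the destination is rewritten to the private IP.
    # The source stays public.
    if pkt.dst in PUBLIC_TO_PRIVATE:
        pkt = replace(pkt, dst=PUBLIC_TO_PRIVATE[pkt.dst])
    return pkt


via_public = deliver("172.31.3.1", "203.0.113.2")
via_private = deliver("172.31.3.1", "172.31.3.2")
print(via_public.src, "->", via_public.dst)
print(via_private.src, "->", via_private.dst)
```

In this model the first call ends up with a public source address and the second keeps the private one, which is the difference the security group on instance #2 reacts to.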

If instance #1 uses 172.31.3.2 to connect to instance #2 then essentially none of this happens, so the security group for #2 will see 172.31.3.1 as the source address.

Using the private IP of the target instance is the way to go.
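
To see why a rule listing the private IP rejects traffic that took the public path, here is a deliberately simplified sketch of a source check. The is_allowed function is a hypothetical helper, not an AWS API, and it ignores ports, protocols, and the stateful nature of real security groups.

```python
from ipaddress import ip_address, ip_network


def is_allowed(source_ip: str, allowed_cidrs: list[str]) -> bool:
    """True if source_ip falls inside any CIDR listed as an allowed source."""
    src = ip_address(source_ip)
    return any(src in ip_network(cidr) for cidr in allowed_cidrs)


# A rule on instance #2 that lists instance #1's private IP as the source.
private_rule = ["172.31.3.1/32"]

# Connecting to 172.31.3.2: instance #2 sees source 172.31.3.1.
print(is_allowed("172.31.3.1", private_rule))

# Connecting to 203.0.113.2: instance #2 sees source 203.0.113.1.
print(is_allowed("203.0.113.1", private_rule))
```

The second check fails because the source seen by instance #2 is the public address, which the private-IP rule does not cover.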

Note that when using private IPs you can also list instance #1's security group ID in instance #2's security group rules instead of listing instance #1's private IP address. It goes in the same place as the IP in the console -- type sg in that box and you should be able to select one. This is probably easier to maintain and works well if you put instance #1 in an auto-scaling "group of one" -- so that autoscaling replaces the machine if it fails.
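
If you would rather script the security-group reference than use the console, a minimal sketch with boto3 might look like the following. The helper names, the group IDs, and the port are placeholders chosen for illustration. The sketch assumes ec2_client is something like boto3.client("ec2") with credentials already configured; the code itself does not import boto3, so the rule structure can be inspected without an AWS account.

```python
def sg_reference_permission(source_group_id: str, port: int, protocol: str = "tcp") -> dict:
    """Build an ingress permission whose source is another security group
    rather than an IP range."""
    return {
        "IpProtocol": protocol,
        "FromPort": port,
        "ToPort": port,
        # Referencing a group ID here takes the place of an IpRanges entry.
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


def allow_from_group(ec2_client, target_group_id: str, source_group_id: str, port: int):
    """Add the rule to the target group (instance #2's security group).

    ec2_client is assumed to be a boto3 EC2 client.
    """
    permission = sg_reference_permission(source_group_id, port)
    return ec2_client.authorize_security_group_ingress(
        GroupId=target_group_id,
        IpPermissions=[permission],
    )


# Placeholder IDs and port, for illustration only.
print(sg_reference_permission("sg-0aaaaaaaaaaaaaaaa", 22))
```

A rule of this shape matches traffic by the sender's security group membership, so it only helps when the instances talk to each other over their private IPs.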

Note also that when two instances communicate using their public IPs, you're billed for the traffic going out and coming back in. It's not as much as Internet traffic costs, but the rate is similar to the rates for intra-region inter-AZ and intra-region VPC peering traffic.