NAT – Does the ELB also route outbound reply traffic in AWS?

amazon-vpc amazon-web-services nat

I have been trying to understand how routing works in an AWS VPC with public/private subnets.

I have a setup as recommended by Amazon, with an ELB and NAT in the public subnet and the webserver in the private subnet. I have security groups (SG) configured as per http://blogs.aws.amazon.com/security/blog/tag/NAT and it all works as expected. Great!

Reference architecture with Amazon VPC configuration

What I do not yet understand is how HTTP replies are returned from the webserver instance in the above architecture.

So a web request comes in from the public Internet over HTTP (port 80), hits the ELB, and the ELB forwards it to the private IP of the webserver. Cool. Now the webserver has to reply. From what I understand, the reply will go out over a different, higher TCP port (1024-65535). The NAT SG only allows outbound traffic over ports 80 and 443. So how does this reply get back out to the public Internet? It cannot go through the NAT. Does this mean the reply goes back out through the ELB? The Amazon diagram does not show the ELB traffic arrow as bidirectional, nor does the ELB documentation state that the ELB behaves like a stateful NAT. Does it?

Best Answer

The arrows in the diagram only indicate the direction of connection establishment -- not traffic flow.

Yes, return traffic goes back through the ELB.

But it isn't a stateful NAT -- it's a TCP connection proxy. The ELB machines accept TCP connections on the configured listening ports, terminate the SSL session if so configured, and establish a new TCP connection to the back-end server. If the listener is configured for HTTP, the ELB operates in a payload-aware mode, parsing, logging, and forwarding HTTP requests to the back-end. Otherwise it's payload-agnostic: it establishes a new TCP connection 1:1 to the back-end for each incoming connection and "ties the pipes together" (with no HTTP-level awareness or modification).

Either way, the source address of the incoming connection to your application is going to be that of the ELB node, not the original client. This is how the response traffic returns to the ELB for return to the client.
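To make the proxy behaviour concrete, here is a toy sketch in Python. It is not how ELB is implemented; it only illustrates the idea of two separate TCP connections tied together. The names `start_backend`, `start_proxy`, and `demo` are made up for this example, everything runs on loopback, and each piece handles a single connection. Because every address on loopback is 127.0.0.1, the source port stands in for "who the back-end thinks it is talking to".

```python
import socket
import threading


def start_backend():
    """Hypothetical back-end: replies with the peer address it sees."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(1)

    def serve():
        conn, peer = srv.accept()
        with conn:
            conn.recv(1024)  # read (and ignore) the request
            # The reply goes back over the SAME connection it arrived on.
            conn.sendall(f"peer={peer[0]}:{peer[1]}".encode())
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()


def start_proxy(backend_addr, seen):
    """Hypothetical stand-in for a balancer node: one connection in, one out."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)

    def serve():
        client, _client_addr = srv.accept()
        # A brand-new TCP connection to the back-end, sourced from the proxy.
        upstream = socket.create_connection(backend_addr)
        seen["proxy_to_backend"] = upstream.getsockname()
        with client, upstream:
            upstream.sendall(client.recv(1024))  # forward the request
            client.sendall(upstream.recv(1024))  # relay the reply to the client
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()


def demo():
    seen = {}
    backend_addr = start_backend()
    proxy_addr = start_proxy(backend_addr, seen)
    with socket.create_connection(proxy_addr) as c:
        client_src = c.getsockname()
        c.sendall(b"hello")
        reply = c.recv(1024).decode()
    return client_src, seen["proxy_to_backend"], reply


if __name__ == "__main__":
    client_src, proxy_src, reply = demo()
    print("client connected from:", client_src)
    print("proxy connected from: ", proxy_src)
    print("back-end saw:         ", reply)
```

The back-end reports the proxy's source address, not the client's, and its reply travels back over the proxy's connection. The webserver never opens a new outbound connection to the client, which is why the NAT and its security group are not involved in the reply.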

In HTTP mode, the ELB adds (or appends to) the X-Forwarded-For header so your application can identify the original client IP, as well as X-Forwarded-Proto: [ http | https ] to indicate whether the client connection uses SSL and X-Forwarded-Port to indicate the front-end port.
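On the application side, reading these headers might look roughly like the sketch below. The function name `client_info` and the plain-dict header representation are assumptions for illustration, not part of any AWS or framework API. It also assumes the ELB is the only proxy in front of the application: since the ELB appends the address it saw to any existing X-Forwarded-For value, the sketch takes the last entry, because earlier entries may have been supplied by the client itself. The fallback values (`http`, port 80) are likewise assumptions.

```python
def client_info(headers, peer_ip):
    """Return (client_ip, scheme, front_end_port) from balancer headers.

    headers: mapping of header name -> value (hypothetical representation).
    peer_ip: source address of the TCP connection, i.e. the balancer node.
    """
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive

    # X-Forwarded-For is a comma-separated list; the balancer appends to it.
    hops = [p.strip() for p in h.get("x-forwarded-for", "").split(",") if p.strip()]
    client_ip = hops[-1] if hops else peer_ip  # fall back to the TCP peer

    scheme = h.get("x-forwarded-proto", "http")    # assumed default
    port = int(h.get("x-forwarded-port", "80"))    # assumed default
    return client_ip, scheme, port


if __name__ == "__main__":
    example = {
        "X-Forwarded-For": "203.0.113.7, 198.51.100.9",
        "X-Forwarded-Proto": "https",
        "X-Forwarded-Port": "443",
    }
    print(client_info(example, "10.0.1.25"))
```

If other trusted proxies sit in front of the balancer, the choice of which entry to trust has to change accordingly.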


Update: the above refers to a type of load balancer that is now known as "ELB Classic" or ELB/1.0 (found in the user agent string it sends with HTTP health checks).

The newer Layer 7 balancer, Application Load Balancer or ELB/2.0, operates similarly with respect to traffic flow. The Layer 4 ("transparent" TCP) capability is removed from ALB, and the Layer 7 features are enhanced significantly.

The newest type of load balancer, the Network Load Balancer, is a Layer 3 balancer. Unlike the other two, it behaves very much like dynamic NAT, handling inbound (outside-originated) connections only, mapping source-addr+port through EIP-addr+port to instance-private-ip-addr+port -- with the EIP bound to the "balancer" -- and unlike the other two types of balancers, the instances need to be on public subnets, and use their own public IPs for this.

Conceptually speaking, the Network Load Balancer seems to actually modify the behavior of the Internet Gateway -- which is, itself, a logical object that cannot be disabled, replaced, or experience a failure in any meaningful sense. This is in contrast to ELB and ALB, which actually operate on "hidden" EC2 instances. NLB operates on the network infrastructure, itself, by all appearances.