There is currently only one way to associate static IP addresses with an Application Load Balancer (ALB) -- AWS Global Accelerator.
Static Anycast IPs – Global Accelerator uses Static IP addresses that serve as a fixed entry point to your applications hosted in any number of AWS Regions. These IP addresses are Anycast from AWS edge locations, meaning that these IP addresses are announced from multiple AWS edge locations, enabling traffic to ingress onto the AWS global network as close to your users as possible. You can associate these addresses to regional AWS resources or endpoints, such as Network Load Balancers, Application Load Balancers, and Elastic IP addresses. You don’t need to make any client-facing changes or update DNS records as you modify or replace endpoints.
https://aws.amazon.com/blogs/aws/new-aws-global-accelerator-for-availability-and-performance/
Global Accelerator allocates two static IPs from two network zones¹, and these are unique to your deployment -- not shared. They are advertised to the Internet via peering connections at multiple locations on the AWS edge network (the same network where CloudFront, Route 53, and S3 Transfer Acceleration all operate -- it has more points of presence than just the AWS regions, plus AWS-managed fiber connections back to those regions). You then associate the endpoints -- ALB, NLB, EIP, or EC2 instance (without an EIP) -- with the Global Accelerator instance, and traffic arriving at an edge location is NAT-ed to your balancer.
When Global Accelerator initially launched, it relied on source NAT to tie the global addresses to the VPC devices, so you couldn't use the client source IP or the X-Forwarded-For header from the ALB to determine the client's address in real time. That has since changed: X-Forwarded-For now correctly identifies the client IP address when an ALB is used with Global Accelerator in most AWS regions.
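Reading the client address out of that header is straightforward; here's a minimal sketch (the addresses below are documentation examples, not real hosts). With an ALB, each hop appends to X-Forwarded-For, so the left-most entry is the original client -- assuming you trust the upstream proxies not to spoof the header.

```python
def client_ip_from_xff(xff_header: str) -> str:
    """Return the left-most address in an X-Forwarded-For header.

    The ALB appends the connecting client's address, so the left-most
    entry is the original client (provided upstream hops are trusted).
    """
    return xff_header.split(",")[0].strip()

# Example: client -> Global Accelerator -> ALB -> your application
print(client_ip_from_xff("203.0.113.7, 10.0.1.25"))  # 203.0.113.7
```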
Client IP address preservation only works when the endpoint is an ALB or an EC2 instance (without an EIP). It doesn't work with EIP endpoints or Network Load Balancers; for those, you can only cross-correlate connections after the fact using VPC flow logs, which capture the source/destination tuples along with the intermediate NAT address that your application sees.
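For that cross-correlation, the fields you need are the source/destination pair in each flow log record. A minimal sketch of parsing the default (version 2) VPC Flow Log format follows; the account and interface IDs are placeholders.

```python
# Field order of the default (version 2) VPC Flow Log record format.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport "
          "dstport protocol packets bytes start end action log_status").split()

def parse_flow_log(record: str) -> dict:
    """Split one space-delimited flow log record into named fields."""
    return dict(zip(FIELDS, record.split()))

rec = parse_flow_log(
    "2 123456789012 eni-0a1b2c3d 198.51.100.44 10.0.2.15 "
    "49152 443 6 10 840 1620000000 1620000060 ACCEPT OK")
print(rec["srcaddr"], "->", rec["dstaddr"])  # 198.51.100.44 -> 10.0.2.15
```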
Importantly, ALB is inbound-only (connections are only ever established from outside to inside, regardless of the ultimate direction of data transfer), so if your servers also initiate outbound connections, you need a separate solution for a static source address -- a NAT Gateway.
One NAT Gateway per availability zone, placed in a public subnet, can serve as the default gateway for one or more private subnets within that zone, so that all instances on those subnets share the same source IP when contacting the Internet. A NAT Gateway is not a box in a physical place -- it's a feature of the VPC network infrastructure, so it's intrinsically fault-tolerant and not considered a single point of failure within its availability zone.

You can share a single NAT Gateway across availability zones, but then you do have a single point of failure if something catastrophic happens in that one zone (and you'll pay slightly more to carry Internet traffic across AZ boundaries than you would by placing one NAT Gateway in each AZ).

A NAT Gateway requires no application changes, because it isn't a proxy -- it's a network address translator, transparent to the instances on the subnets configured to use it. Each NAT Gateway has a static EIP.
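The "many private sources, one public address" behavior is just port-address translation. Here's a toy sketch of the idea (the EIP and private addresses are made up; a real NAT Gateway is managed infrastructure, not something you implement):

```python
import itertools

class SourceNat:
    """Toy port-address translation: many private sources share one
    public IP, the way a NAT Gateway presents a single Elastic IP
    to the Internet. Illustrative only."""

    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self._ports = itertools.count(1024)   # next free public-side port
        self.table = {}                       # (priv_ip, priv_port) -> pub_port

    def translate(self, private_ip: str, private_port: int) -> tuple:
        key = (private_ip, private_port)
        if key not in self.table:             # reuse existing mappings
            self.table[key] = next(self._ports)
        return (self.public_ip, self.table[key])

nat = SourceNat("198.51.100.10")          # the gateway's static EIP (example)
print(nat.translate("10.0.1.5", 40000))   # ('198.51.100.10', 1024)
print(nat.translate("10.0.2.9", 40000))   # ('198.51.100.10', 1025)
```

Both private instances appear on the Internet as 198.51.100.10, which is why remote services only ever need to allowlist that one static address.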
¹ "Network zone" is AWS terminology introduced with Global Accelerator. It indicates that the two IP addresses are, internally, handled by independent infrastructure.
Best Answer
Network load balancing is the distribution of traffic based on network variables, such as IP address and destination ports. It is layer 4 (TCP) and below and is not designed to take into consideration anything at the application layer such as content type, cookie data, custom headers, user location, or the application behavior. It is context-less, caring only about the network-layer information contained within the packets it is directing this way and that.
Application load balancing is the distribution of requests based on multiple variables, from the network layer to the application layer. It is context-aware and can direct requests based on any single variable as easily as it can a combination of variables. Applications are load balanced based on their peculiar behavior and not solely on server (operating system or virtualization layer) information.
The difference between the two is important because network load balancing cannot assure availability of the application. This is because it bases its decisions solely on network- and TCP-layer variables and has no awareness of the application at all. Generally, a network load balancer will determine "availability" based on the ability of a server to respond to an ICMP ping, or to correctly complete the three-way TCP handshake. An application load balancer goes much deeper, and is capable of determining availability based not only on a successful HTTP GET of a particular page but also on verifying that the content is as expected based on the input parameters.
This is also important to note when considering the deployment of multiple applications on the same host sharing IP addresses (virtual hosts, in old skool speak). A network load balancer will not differentiate between Application A and Application B when checking availability (indeed, it cannot unless the ports differ), but an application load balancer will differentiate between the two applications by examining the application-layer data available to it. This difference means that a network load balancer may end up sending requests to an application that has crashed or is offline, but an application load balancer will never make that same mistake.
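The contrast above can be sketched in a few lines (hostnames and page markers below are invented for illustration): a layer-4 check sees only whether the TCP handshake completes, while a layer-7 check can verify per-application content -- including telling apart two virtual-hosted applications sharing one address and port.

```python
# Layer 4: the target is "up" if the TCP handshake completes (or it
# answers a ping) -- no awareness of what the application returns.
def l4_healthy(handshake_ok: bool) -> bool:
    return handshake_ok

# Layer 7: per-application expectations, keyed by Host header, for two
# apps behind the same IP and port. (Hostnames/markers are made up.)
EXPECTED = {
    "app-a.example.com": "Welcome to App A",
    "app-b.example.com": "Welcome to App B",
}

def l7_healthy(host: str, status: int, body: str) -> bool:
    return status == 200 and EXPECTED[host] in body

# App B has crashed, but its host still accepts TCP connections, so a
# network load balancer would keep sending it traffic:
print(l4_healthy(True))                                           # True
print(l7_healthy("app-a.example.com", 200, "Welcome to App A"))   # True
print(l7_healthy("app-b.example.com", 502, "Bad Gateway"))        # False
```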
Reference:
Network Load Balancing versus Application Load Balancing