Instead of whitelisting S3 traffic at the NAT instance, I suggest you configure VPC endpoints for S3.
That way, your EC2 nodes can access S3 directly on the VPC private network instead of going through the NAT.
If you still want a list of IPs used by S3, you can use the AWS CLI command describe-prefix-lists:
aws ec2 describe-prefix-lists
which will give an output like
{
    "PrefixLists": [
        {
            "PrefixListName": "com.amazonaws.eu-west-1.s3",
            "Cidrs": [
                "54.231.128.0/19"
            ],
            "PrefixListId": "pl-6da54004"
        }
    ]
}
The list is for the region used by the AWS CLI. My sample shows the current output for eu-west-1. You can specify a different region by passing the --region parameter, e.g. aws --region us-east-1 ec2 describe-prefix-lists.
However, please note that the IP range for a service may change from time to time.
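Because the ranges can change, it is safer to parse the command output each time than to hardcode the CIDRs. Here is a minimal Python sketch of that idea, assuming the JSON shape shown in the sample above. The function names (s3_networks, is_s3_address) are made up for illustration, and the embedded string is just the eu-west-1 sample from this answer, not live data.

```python
import ipaddress
import json

# Sample output copied from the answer above (eu-west-1 at the time of writing).
# In practice you would feed in the live output of `aws ec2 describe-prefix-lists`.
SAMPLE_OUTPUT = """
{
    "PrefixLists": [
        {
            "PrefixListName": "com.amazonaws.eu-west-1.s3",
            "Cidrs": ["54.231.128.0/19"],
            "PrefixListId": "pl-6da54004"
        }
    ]
}
"""


def s3_networks(cli_output):
    """Return the CIDR blocks of every prefix list whose name ends in '.s3'."""
    data = json.loads(cli_output)
    return [
        ipaddress.ip_network(cidr)
        for prefix_list in data["PrefixLists"]
        if prefix_list["PrefixListName"].endswith(".s3")
        for cidr in prefix_list["Cidrs"]
    ]


def is_s3_address(address, networks):
    """True if the address falls inside any of the given networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in networks)


networks = s3_networks(SAMPLE_OUTPUT)
print(is_s3_address("54.231.130.10", networks))  # inside 54.231.128.0/19
print(is_s3_address("8.8.8.8", networks))        # outside the S3 range
```

Re-running the CLI command and re-parsing on a schedule would pick up range changes without anyone having to edit firewall rules by hand.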
First, a bit of background. The DNS resolver for VPC instances is a virtual component that is built in to the infrastructure. It's immune to the outbound security group rules... but the resolution of the hostnames for S3 endpoints doesn't change when you provision an S3 endpoint for your VPC.
What a VPC endpoint for S3 does is a couple of different things. Understanding what those things are is key to understanding whether it will do what you need. tl;dr: it will, in this case.
First, you notice they are configured in the route tables as "prefix lists." A VPC endpoint takes a set of predefined IPv4 network prefixes, and hijacks the routes to those prefixes for every route table that includes the respective prefix list so that your traffic to any of those networks will traverse the VPC endpoint instead of the Internet Gateway and any intermediate NAT instance.
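The route "hijacking" works because VPC route tables pick the most specific matching route (longest prefix match), so a prefix-list route wins over the default route to the NAT. Here is a rough Python sketch of that selection logic. The route table is invented for illustration: the 10.0.0.0/16 VPC range and the target labels are assumptions, and the /19 is the eu-west-1 sample range quoted in another answer here.

```python
import ipaddress

# Hypothetical route table: destination CIDR -> target label.
ROUTES = {
    "10.0.0.0/16": "local",              # assumed VPC CIDR
    "0.0.0.0/0": "nat-instance",         # default route via the NAT
    "54.231.128.0/19": "vpc-endpoint",   # prefix-list route added by the endpoint
}


def next_hop(destination, routes=ROUTES):
    """Pick the target of the most specific route that contains the destination."""
    ip = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if ip in ipaddress.ip_network(cidr)
    ]
    # Longest prefix wins: /19 beats /0 for addresses inside the S3 range.
    _, target = max(matches, key=lambda match: match[0].prefixlen)
    return target


print(next_hop("54.231.130.10"))  # S3 address, goes through the endpoint
print(next_hop("8.8.8.8"))        # everything else still goes via the NAT
```

Only traffic to the prefix-list ranges changes path; all other Internet-bound traffic keeps using the NAT as before.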
In essence, this opens a new path out from your VPC to the AWS service's IP address ranges... but where those IP addresses take you is not, initially, the same place as they would take you without the VPC endpoint in place.
The first place you hit looks just like S3 but it isn't identical to the Internet-facing S3, because it knows about your VPC endpoint's policies, so that you can control which buckets and actions are accessible. These do not override the other policies, they augment them.
An endpoint policy does not override or replace IAM user policies or S3 bucket policies. It is a separate policy for controlling access from the endpoint to the specified service. However, all types of policies — IAM user policies, endpoint policies, S3 policies, and Amazon S3 ACL policies (if any) — must grant the necessary permissions for access to Amazon S3 to succeed.
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints.html#vpc-endpoints-access
Note that if you do not restrict bucket access with an appropriate policy, and instead enable full access, the instances will be able to access any bucket in the S3 region if the bucket's policies allow it, including public buckets.
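To make the restriction concrete, here is a sketch of an endpoint policy that allows reads from a single bucket only, built as a Python dict and printed as JSON. The bucket name "example-bucket" and the helper name are placeholders, and the action list is an assumption; adjust both to what your instances actually need.

```python
import json


def bucket_only_policy(bucket_name, actions=("s3:GetObject",)):
    """Build an endpoint policy document limited to one bucket (illustrative)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": list(actions),
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",      # the bucket itself
                    f"arn:aws:s3:::{bucket_name}/*",    # objects in the bucket
                ],
            }
        ],
    }


# "example-bucket" is a placeholder, not a bucket from the original answer.
print(json.dumps(bucket_only_policy("example-bucket"), indent=2))
```

Since endpoint policies only add to the other policies, the bucket policy and IAM policies still have to allow the same requests for them to succeed.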
Now, the tricky part. If your instance's security group doesn't allow access outbound to S3 because the default "allow" rule has been removed, you can allow the instance to access S3 via the VPC endpoint, with a specially-crafted security group rule:
Add a new outbound rule to the security group. For the "type," choose HTTPS. For the destination, choose "Custom IP."
The documentation is not consistent with what I see in the console:
The Destination list displays the prefix list IDs and names for the available AWS services.
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints.html#vpc-endpoints-security
Well... no, it doesn't. Not for me, at least, not as of this writing.
The solution is to choose "Custom IP" and then, instead of an IP address block or security group ID, type the prefix list ID for your VPC endpoint, in the form pl-xxxxxxxx, in the box for the IP address. You can find this ID in the VPC console, by looking at the route table destinations for one of the subnets associated with the VPC endpoint.
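If you would rather script this than click through the console, the same rule can be expressed through the API. What follows is a sketch, not a tested deployment script: it assumes boto3 is installed and credentials are configured, and the helper names are made up. The rule itself is an HTTPS (TCP 443) egress permission whose destination is a prefix list ID instead of a CIDR.

```python
import re


def https_egress_to_prefix_list(prefix_list_id):
    """Build an egress permission allowing TCP 443 to the given prefix list."""
    # Loose sanity check on the pl-xxxxxxxx form mentioned above.
    if not re.fullmatch(r"pl-[0-9a-f]+", prefix_list_id):
        raise ValueError(f"not a prefix list ID: {prefix_list_id!r}")
    return {
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "PrefixListIds": [{"PrefixListId": prefix_list_id}],
    }


def apply_rule(security_group_id, prefix_list_id):
    """Add the rule to a security group (assumes boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_egress(
        GroupId=security_group_id,
        IpPermissions=[https_egress_to_prefix_list(prefix_list_id)],
    )


# pl-6da54004 is the eu-west-1 sample ID quoted elsewhere on this page.
print(https_egress_to_prefix_list("pl-6da54004"))
```

Calling apply_rule with your own security group ID and the prefix list ID from your route table should produce the same rule as the console steps.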
The 403s are expected as directory indexing is not allowed in S3.
For the 404s you may be doing something wrong. For example, http://us-east-1.ec2.archive.ubuntu.com.s3.amazonaws.com/ubuntu/dists/lucid/Release works for me.