Nginx – regex to match S3 URLs in an nginx location directive and proxy to Amazon S3

amazon-s3, nginx, reverse-proxy

In an nginx location directive, how do I match S3 URLs?

For example, the incorrect URL is:

http://example.com/https://s3.amazonaws.com/mybucket/logo.jpg?1404251306

From the logs, I can see that nginx is unable to serve the request for this URL and returns a 404.

xx.xx.xx.xx - - [15/Aug/2014:12:38:04 +0000] "GET /https://s3.amazonaws.com/mybucket/logo.jpg HTTP/1.1" 404 151 "-" "Mo

Given this, I want to match this URL, proxy the request to S3, and return logo.jpg. I have come up with something like this:

location ~* ^/https/(.*) {
  set $s3_host 's3.amazonaws.com';
  set $s3_bucket 'mybucket';

  proxy_set_header       Host $s3_host;
  proxy_set_header       Authorization '';
  proxy_hide_header      x-amz-id-2;
  proxy_hide_header      x-amz-request-id;
  proxy_hide_header      Set-Cookie;
  proxy_ignore_headers   "Set-Cookie";
  proxy_buffering        off;
  proxy_intercept_errors on;

  resolver               8.8.8.8 valid=300s;
  resolver_timeout       10s;

  proxy_pass http://$1;
}

Questions:

  1. What regex should I use in the location directive so that it matches Amazon S3 URLs only?
  2. Currently, it handles any bucket. How do I restrict the bucket as well?

[Update]

I get the following error:

==> /var/log/nginx/error.log <==
2014/08/15 13:53:08 [error] 1579#0: *1 invalid port in upstream ":/s3.amazonaws.com/mybucket/logo.jpg", client: xx.xx.xx.xx, server: localhost, request: "GET /https://s3.amazonaws.com//mybucket/logo.jpg HTTP/1.1", host: "54.164.92.206"

Best Answer

Edit: (1) Sorry, there was a typo here. (2) I adjusted the regex so that it matches one or more slashes before the string mybucket, as in your log above.

Well, maybe you mean something like this:

location ~* ^/https://s3\.amazonaws\.com/+mybucket(.*) {
  ...
  proxy_pass http://s3.amazonaws.com/mybucket$1;
}
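
For completeness, here is a minimal sketch of how that location might look once the proxy settings from the question are folded back in. It assumes nginx's default merge_slashes setting (on), under which adjacent slashes in the request URI are collapsed before location matching; that would be consistent with the ":/s3.amazonaws.com/mybucket/logo.jpg" seen in the error log. For that reason the pattern uses /+ after "https:" as well as before the bucket name. The host and the bucket name mybucket are taken from the question; everything else is an untested starting point, not a verified configuration.

```nginx
# Matches /https:/s3.amazonaws.com/mybucket/... and variants with extra slashes.
# The host and bucket are spelled out literally, so requests for other hosts
# or other buckets do not match this location.
location ~* ^/https:/+s3\.amazonaws\.com/+mybucket/(.*)$ {
  proxy_set_header       Host s3.amazonaws.com;
  proxy_set_header       Authorization '';
  proxy_hide_header      x-amz-id-2;
  proxy_hide_header      x-amz-request-id;
  proxy_hide_header      Set-Cookie;
  proxy_ignore_headers   "Set-Cookie";
  proxy_intercept_errors on;

  # proxy_pass below contains variables, so nginx resolves the host name
  # at request time and needs a resolver (same one as in the question).
  resolver               8.8.8.8 valid=300s;
  resolver_timeout       10s;

  # With variables in proxy_pass the URI is sent as written, so the query
  # string (e.g. ?1404251306) is appended explicitly via $is_args$args.
  proxy_pass http://s3.amazonaws.com/mybucket/$1$is_args$args;
}
```

Because only the part of the path after the bucket name is captured, the upstream host is never taken from the client's request, which avoids the "invalid port in upstream" error caused by passing ":/s3.amazonaws.com/..." to proxy_pass.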