Trying to get the following behavior working in nginx:
A default rate limit of 1r/s per IP for browsers.
A rate limit of 10r/s for the Bing and Google spiders.
Reject bad bots.
Unfortunately, Google doesn't publish IP addresses for Googlebot, so I'm limited to matching on User-Agent.
So far this gets close:
http {
    # Rate limits
    map $http_user_agent $uatype {
        default                 'user';
        ~*(google|bing|msnbot)  'okbot';
        ~*(slurp|nastybot)      'badbot';
    }
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
    limit_req_zone $binary_remote_addr zone=two:10m rate=10r/s;
    ...
    server {
        ...
        location / {
            if ($uatype = 'badbot') {
                return 403;
            }
            limit_req zone=one burst=5 nodelay;
            if ($uatype != 'user') {
                limit_req zone=two burst=10 nodelay;
            }
            ...
        }
        ...
    }
}
BUT: limit_req isn't allowed inside an 'if' block.
$ nginx -t
nginx: [emerg] "limit_req" directive is not allowed here in /etc/nginx/nginx.conf
nginx: configuration file /etc/nginx/nginx.conf test failed
There are so many untested suggestions on the nginx forums, and most don't even pass a configtest.
One that looks promising is Nginx Rate Limiting by Referrer? The downside of that version is that all of the configuration is repeated for each different limit (and I have many rewrite rules).
Anyone got something good?
Best Answer
Unfortunately you can't make this dynamic with 'if'; the limit request module doesn't support it.
The link you found is probably the only way to achieve this. Use the
include
directive to "avoid" repeating your configuration.

But what if a third-party crawler suddenly impersonates a goodbot user agent?
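One way to avoid both the 'if' blocks and the per-limit duplication is to lean on two documented limit_req behaviors: requests whose zone key evaluates to an empty string are not accounted, and several limit_req directives may appear in the same context (all of them apply). A minimal sketch along those lines; the variable names $limit_user and $limit_bot are my own, and this is untested against your rewrite rules:

```nginx
http {
    map $http_user_agent $uatype {
        default                 'user';
        ~*(google|bing|msnbot)  'okbot';
        ~*(slurp|nastybot)      'badbot';
    }

    # Key is the client address only for the class each zone targets;
    # an empty key means the request is not counted against that zone.
    map $uatype $limit_user {
        'user'   $binary_remote_addr;
        default  '';
    }
    map $uatype $limit_bot {
        'okbot'  $binary_remote_addr;
        default  '';
    }

    limit_req_zone $limit_user zone=one:10m rate=1r/s;
    limit_req_zone $limit_bot  zone=two:10m rate=10r/s;

    server {
        location / {
            if ($uatype = 'badbot') {
                return 403;
            }
            # Both directives are legal here; each only bites when
            # its zone key is non-empty for this request.
            limit_req zone=one burst=5 nodelay;
            limit_req zone=two burst=10 nodelay;
        }
    }
}
```

This keeps a single location block, so the rewrite rules don't have to be repeated per limit, although it still trusts the User-Agent header just like the original map.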