URL-based request rate limiting in Apache

Tags: apache-2.2, rate-limiting

There are many rate limiting and QoS tools out there, but so far I have not been able to find one that meets my specific needs, as follows:

I have a web app running on Apache 2.2 and I want each of my customers to be limited to, say, 10000 requests per day. The limiting should not be based on the customer's IP address but on the customer's account, which can be extracted from the URL; e.g.: http://mywebsite.com/customer1/page1. Once they are over their limit, I would like requests from them to be "slowed down" to some predefined value, such as 15 requests per minute, perhaps by introducing a delay. If that's not possible, then returning an error like 503 Service Unavailable would be ok.

How can I do this with Apache? If there is no way to do it with Apache, then I'd be interested in suggestions for other tools, reverse proxies, etc.

UPDATE: The rate limiting needs to happen across a number of services (mod_dav_svn, mod_passenger, mod_wsgi) and must be high-performance.

Best Answer

I don't think you will be able to do this purely in Apache.

There are quite a few Apache modules that do bandwidth limiting but I don't think any of them will match what you have described.

mod_bw can limit bandwidth based on various parts of the request. It can limit based on IP address, file extension, mime type, file size and directory (based on my reading of the docs). You achieve the directory-based limiting by placing the module's directives inside a <Directory > block so I expect that they would work just as well inside a <Location > block. mod_bwshare seems to go inside <Directory > and <Location > blocks as well.
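As an illustration of the kind of configuration involved, here is a sketch of what a per-path mod_bw setup might look like. The directive names come from the mod_bw docs, but the paths and rate are placeholders, and you would need one such block per customer:

```apache
# Hypothetical mod_bw configuration; paths and rates are placeholders.
BandWidthModule On
ForceBandWidthModule On

<Location /customer1>
    # Limit all clients under this path to roughly 15 KB/s
    BandWidth all 15360
    MinBandWidth all -1
</Location>
```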

The configuration for these modules lives in your Apache config files, and they don't look up values from any external source, so you have to restart Apache to change any of the values. If you only want to throttle a customer once they have gone over their quota, you would have to edit the Apache config and restart Apache every time a customer exceeds their quota, and again when their quota usage resets.

The way I would do this is to write a small application, in whatever language you use, that accepts a request for a file, checks the database to see whether the customer is over quota, and then sends them the file at whatever rate is appropriate. The author of mod_bw explained how he achieved the throttling: he simply divides the file into small "chunks" (say 5KB per chunk) and then has a short sleep() in between echoing each chunk out to the client.
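That chunk-and-sleep idea can be sketched in a few lines of Python (the function name and chunk size are illustrative; a real handler would stream from disk rather than hold the file in memory):

```python
import time

CHUNK_SIZE = 5 * 1024  # 5 KB chunks, as in the mod_bw description


def throttled_chunks(data, rate_bytes_per_sec, chunk_size=CHUNK_SIZE):
    """Yield data in small chunks, sleeping between chunks so that the
    overall throughput approximates rate_bytes_per_sec."""
    delay = chunk_size / float(rate_bytes_per_sec)
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]
        time.sleep(delay)
```

In a web handler you would write each yielded chunk to the client socket and flush, so the sleeps translate directly into a lower delivery rate.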

I would then use mod_rewrite to translate the original file requests (which you said look like /customer1/page1) into something like /throttle.php?customer=customer1&file=page1.
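A rewrite rule along these lines might do it (illustrative only; the pattern assumes server-context configuration and a throttling script at /throttle.php):

```apache
RewriteEngine On
# /customer1/page1 -> /throttle.php?customer=customer1&file=page1
RewriteRule ^/([^/]+)/(.+)$ /throttle.php?customer=$1&file=$2 [L,QSA]
```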

The app can write to the database at the start of the download to indicate that it is currently processing a file and again at the end to indicate that it is now finished. This should enable you to stop one customer from hogging all your Apache children.
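A minimal sketch of that bookkeeping, assuming a SQLite table whose schema and names are purely illustrative, might look like this:

```python
import sqlite3
import time


def open_db(path=":memory:"):
    """Open the bookkeeping database, creating the table if needed."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS active_downloads (
                      customer TEXT, file TEXT, started REAL)""")
    return db


def begin_download(db, customer, filename):
    """Record that a download for this customer is in progress."""
    db.execute("INSERT INTO active_downloads VALUES (?, ?, ?)",
               (customer, filename, time.time()))
    db.commit()


def end_download(db, customer, filename):
    """Remove the in-progress record once the download finishes."""
    db.execute("DELETE FROM active_downloads WHERE customer=? AND file=?",
               (customer, filename))
    db.commit()


def active_count(db, customer):
    """How many downloads this customer currently has running."""
    return db.execute(
        "SELECT COUNT(*) FROM active_downloads WHERE customer=?",
        (customer,)).fetchone()[0]
```

Before starting a transfer, the app could check `active_count()` and refuse or delay the request if the customer already has too many simultaneous downloads.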