Nginx – Scaling large file downloads

apache-2.2, cache, nginx, squid, varnish

We currently deliver large (1GB+) files from a single Apache server, but it is heavily disk-I/O-bound and we need to scale.

My first idea was to simply duplicate this Apache server, but our file library is too large to horizontally scale it N times that way.

So my next idea was to have two highly-available Apache servers in the backend, each with a full copy of our library, and "N" caching reverse proxies in front, where "N" grows as our delivery needs grow. Each reverse proxy is very RAM-heavy and has as many spindles per GB as possible, while the backend Apache servers are more "archival", with a low spindle-to-GB ratio.
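
Roughly what I have in mind for each front-end proxy, sketched as an nginx config (all hostnames, paths, and sizes here are placeholders, not our real setup):

```nginx
# Sketch of one caching reverse proxy. Everything here (hostnames, paths,
# cache size, retention) is a placeholder, not our real configuration.

proxy_cache_path /var/cache/nginx/files levels=1:2 keys_zone=files:100m
                 max_size=500g inactive=30d;

upstream apache_origin {
    # The two highly-available Apache backends, each holding the full library.
    server apache1.example.internal:80;
    server apache2.example.internal:80 backup;
}

server {
    listen 80;
    server_name downloads.example.com;

    location / {
        proxy_pass http://apache_origin;

        proxy_cache files;
        proxy_cache_valid 200 30d;

        # Fetch each large file from the origin only once, even when many
        # clients request the same URL before it is cached.
        proxy_cache_lock on;
        proxy_cache_lock_timeout 300s;

        # Keep serving a cached copy if the backend is briefly unreachable.
        proxy_cache_use_stale error timeout updating;
    }
}
```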

Is this a good architecture? Is there a better way to handle it?

Best Answer

This is not a bad architecture (Squid is a popular reverse proxy for this), but if you expect exponential growth, a Content Delivery Network could be a better solution: you only pay for the bandwidth you actually use, capacity scales instantly (you don't have to keep deploying more servers), and downloads are served from edge locations as close to the client as possible, which maximizes transfer speeds. That said, I've never tried this with 1GB+ files, and the cost may be prohibitive.

Torrent technology can be thought of as a peer-to-peer CDN in this case, so some of these providers may also be suitable as torrent seeds for your content, reducing your total bandwidth costs and (possibly) increasing speed, although that depends on your leechers.