I use Apache and mod_jk for this, and it works fine. What would be the benefit of switching from Apache to lighttpd?
If it ain't broke, don't fix it. The benefits of switching to Lighttpd are mostly performance; Lighttpd requires less CPU/RAM to do the same work as Apache. It can also be easier to set up, but since you already have Apache running, that is a non-issue for you.
I disagree with Andy on the security issue; Apache 2 has had more reported security issues than Lighttpd, but most of them are in modules you wouldn't compile in for load balancing, and Apache is quick to release fixes for its security issues. Lighttpd gets much less security scrutiny than Apache, so it may have unpublicized issues we don't know about. It is an apples-to-oranges comparison...
Today Lighttpd is losing momentum, IMHO. Since its author landed a job at MySQL and started working on MySQL Proxy, the frequency of Lighttpd releases has gone down. My gut feeling is that most new installations of event-driven open-source HTTP servers use nginx now. See the English-language wiki for an overview of nginx.
For gratis open-source load balancing, I believe the largest installed base is for HAProxy and nginx now. It is hard to come up with numbers, as public surveys such as Netcraft cannot detect backend load balancers, but this is my gut feeling based on the blog posts that I see.
Note that neither nginx nor HAProxy can do Apache JServ Protocol (AJP) proxying. nginx can proxy HTTP and FastCGI and maybe a few more protocols, and HAProxy speaks HTTP and plain TCP. Thus you would have to switch the application server to HTTP output.
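If you did make that switch, a minimal nginx sketch of the front end might look like this, assuming the app servers expose Tomcat's plain HTTP connector on its default port 8080 (the host names are made up):

```
# Balance plain HTTP across two app servers instead of using AJP/mod_jk
upstream appservers {
    server app1.internal:8080;
    server app2.internal:8080;
}
server {
    listen 80;
    location / {
        proxy_pass http://appservers;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $remote_addr;
    }
}
```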
My recommendation would be to stay with Apache, unless you have a specific functionality need that Apache does not solve for you.
The solution I use, which can easily be implemented on VPSs, is the following:
- DNS is round-robined across 6 different valid IP addresses.
- I have 3 load balancers with identical configuration, using corosync/pacemaker to distribute the 6 IP addresses evenly (so each machine gets 2 addresses).
- Each of the load balancers runs an nginx + Varnish configuration. Nginx deals with receiving the connections, doing rewrites and some static serving, and passes the rest back to Varnish, which does the load balancing and caching (see the nginx sketch below).
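A minimal sketch of that nginx front, assuming Varnish listens on localhost port 6081 (the port and paths are assumptions):

```
server {
    listen 80;
    location /static/ {
        root /var/www;                     # served straight from disk/NFS
    }
    location / {
        proxy_pass http://127.0.0.1:6081;  # Varnish does the LB and caching
        proxy_set_header Host $host;
    }
}
```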
This arch has the following advantages, in my biased opinion:
- corosync/pacemaker will redistribute the IP addresses in case one of the LBs fails.
- nginx can be used to terminate SSL and to serve certain types of files (big videos, audio, or other large files) directly from the filesystem or NFS, without going through the cache.
- Varnish is a very good load balancer, supporting weights and backend health checking, and does an outstanding job as a reverse proxy (see the VCL sketch after this list).
- If more LBs are needed to handle the traffic, just add more machines to the cluster and the IP addresses will be rebalanced between all the machines. You can even do it automatically (adding and removing load balancers). That's why I use 6 IPs for 3 machines: to leave some room for growth.
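A minimal Varnish VCL sketch of that load-balancing part, in Varnish 2/3-era syntax (backend addresses, weights and the probe URL are assumptions):

```
# Two app backends with health probes; the random director honours weights
backend app1 {
    .host = "10.0.0.11";
    .port = "80";
    .probe = { .url = "/health"; .interval = 5s; .window = 5; .threshold = 3; }
}
backend app2 {
    .host = "10.0.0.12";
    .port = "80";
    .probe = { .url = "/health"; .interval = 5s; .window = 5; .threshold = 3; }
}
director pool random {
    .retries = 3;
    { .backend = app1; .weight = 2; }   # app1 gets twice the traffic
    { .backend = app2; .weight = 1; }
}
sub vcl_recv {
    set req.backend = pool;             # unhealthy backends are skipped
}
```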
In your case, having physically separated VPSs is a good idea, but it makes the IP sharing more difficult. The objective is a fault-tolerant, redundant system, and some load-balancing/HA configurations end up defeating that by adding a single point of failure (like a single load balancer that receives all the traffic).
I also know you asked about Apache, but these days we have specific tools better suited to the job (like nginx and Varnish). Leave Apache to run the applications on the backend and serve them using other tools (not that Apache can't do good load balancing or reverse proxying; it's just a question of offloading different parts of the job to more services, so that each part can do its share well).
Best Answer
I had to design something similar recently. Here's my conceptual back-of-the-envelope design.
Load distribution
I'm assuming that the network ops guys have set up a redundant and highly available routing, firewalling and switching fabric that delivers requests to the load balancers.
(If not, I'd go with a stateful HA setup of either PF or IPtables, with automatic failover using CARP or keepalived.) The load balancer specifications would depend on the web application's load-distribution methodology and cost, among other metrics.
Depending on the budget, the load balancing could be implemented using:
- Hardware-based load balancers, which tend to be pricey
- Software-based proxies such as HAProxy
The load balancers have to be highly available, so I'd go for a couple of active load balancers with standby backups (say, 2 active HAProxy instances with 2 more in standby mode).
I'd have the routing layer send the requests to the load balancers. In case one of the load balancers failed, a keepalived-based solution would be used to seamlessly replace the faulty box; a minimal sketch follows.
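Something along these lines, assuming a single shared virtual IP (the VIP, interface and priorities are placeholders):

```
# /etc/keepalived/keepalived.conf on the active box; the standby runs the
# same file with state BACKUP and a lower priority
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    virtual_ipaddress {
        192.0.2.10        # the VIP the routing layer sends traffic to
    }
}
```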
Once the load balancers accept requests, they'd pass them on to the caching layer. The caching layer would handle serving static files and cacheable dynamic responses.
The caching layer can be implemented using a solution such as Squid or NGINX in reverse-proxy mode. By doing this, we'll reduce the load on the application servers by sending only dynamic requests to the Apache/PHP servers.
To keep costs at a minimum, I'd have HAProxy and NGINX sitting on the same box.
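A sketch of that NGINX caching tier, assuming HAProxy on the same box forwards to NGINX on a local port (names, ports and sizes are placeholders):

```
# Cache zone on local disk; hits are served here, misses go to Apache/PHP
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=appcache:50m max_size=1g;

upstream phppool {
    server 10.0.1.11:80;
    server 10.0.1.12:80;
}
server {
    listen 8080;                      # HAProxy forwards here
    location / {
        proxy_cache appcache;
        proxy_cache_valid 200 10m;    # cache successful responses for 10 min
        proxy_pass http://phppool;
    }
}
```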
An easy and scalable way to do this would be to have CSS, JS and static images served from a subdomain of the website (say http://cdn.myservice.com/static). With such a setup, we could in future install caching instances globally and have DNS send static requests to the closest CDN instance. Initially, though, the CDN work can be handled by these NGINX instances to keep costs low.
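That subdomain could be as simple as the following on those NGINX instances (the document root and cache lifetime are assumptions):

```
# Serve cdn.myservice.com straight from disk with far-future cache headers
server {
    listen 80;
    server_name cdn.myservice.com;
    root /var/www/static;
    expires 30d;                      # let browsers and future CDNs cache hard
    add_header Cache-Control public;
}
```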
Processing Layer
The processing layer consists of a pool of servers optimised for Apache/PHP. They would load their configuration files from an NFS or distributed-filesystem share and serve their requests by processing PHP scripts from another remote share (NFS or DFS). Using these remote shares eases the overhead of maintaining and syncing server configurations; the mounts could look like the sketch below.
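For example, in /etc/fstab on each web node (the server name and export paths are made up):

```
# Same configuration and code on every node, mounted read-only from NFS
nfs1.internal:/exports/apache-conf  /etc/apache2/shared  nfs  ro,hard,intr  0 0
nfs1.internal:/exports/webroot      /var/www             nfs  ro,hard,intr  0 0
```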
Apache and PHP could be further optimised, for example by loading only the Apache modules that are actually needed, tuning the MPM settings to the hardware, and using a PHP opcode cache such as APC.
A memcache server pool can also be configured to store the results of common and expensive database queries. Read queries would typically be sent to the slaves if their results are not in the memcache layer, and the results would then be cached. Writes would be sent to the master and may involve invalidating data in the memcache layer. PHP session data can also be shared over memcached, so that if any single Apache/PHP server fails, the remaining servers can pick up the session data from memcache. A PHP sketch of the read-through pattern follows.
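Roughly like this, using the pecl memcached extension (key names, TTL and host names are assumptions):

```
<?php
// Check memcached first; on a miss, read from a slave and cache the result.
$mc = new Memcached();
$mc->addServer('memcache1.internal', 11211);

function get_user_profile($mc, $db_slave, $user_id) {
    $key = "user_profile:$user_id";
    $row = $mc->get($key);
    if ($row === false) {                        // cache miss
        $stmt = $db_slave->prepare('SELECT * FROM users WHERE id = ?');
        $stmt->execute(array($user_id));
        $row = $stmt->fetch(PDO::FETCH_ASSOC);
        $mc->set($key, $row, 600);               // keep for 10 minutes
    }
    return $row;
}

// Writes go to the master and invalidate the cached copy:
//   $mc->delete("user_profile:$user_id");
// Sessions can be shared the same way via php.ini:
//   session.save_handler = memcached
//   session.save_path    = "memcache1.internal:11211"
```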
Scaling for load in the processing pool would be a matter of adding more servers and updating the reverse proxies. The server pool may be partitioned into a number of logical groups; a logical group would then use a common configuration shared over NFS and can be upgraded as a block.
The upgrade can then be monitored, and if issues are detected, fixes can be implemented or a rollback performed. The logical groups could be distributed over racks that share nothing (power, network switches, etc.) and consist of disparate members (say, server models A, B and C from Dell) so that block-migration tests are comprehensive.
Database
For the database, I'd have a MySQL server running in a master/multi-slave setup. The master would be optimised for writes, with binary logging enabled for replication. This typically means the usual MySQL optimisations for a write-heavy master, along the lines of the sketch below.
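A my.cnf sketch for the master (the values are placeholders to be sized to the hardware):

```
[mysqld]
server-id                      = 1          # unique across the replication set
log-bin                        = mysql-bin  # binary logging, required for slaves
innodb_buffer_pool_size        = 2G         # keep the hot working set in RAM
innodb_flush_log_at_trx_commit = 1          # full durability on the master
sync_binlog                    = 1          # don't lose binlog events on a crash
```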
Slaves would be configured for reads and would need to be constantly monitored for replication lag using a utility such as maatkit's mk-heartbeat. A lagging server may be removed from the PHP read set until it catches up; mk-heartbeat usage is roughly as follows.
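The database name and hosts below are assumptions; check the maatkit documentation for the exact flags:

```
# On the master: keep updating a heartbeat row once per second
mk-heartbeat -D maatkit --update -h master.internal --daemonize

# On each slave: report how far behind that heartbeat row the slave is
mk-heartbeat -D maatkit --monitor -h slave1.internal
```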
In case of a master failure, a slave would be promoted to master, the remaining slaves would be repointed to it, and the application would be directed to the new master, for example via a DNS update.
An alternative to DNS may be storing the list of good slaves in a cache such as memcache and updating it appropriately, as sketched below.
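In PHP, with the key name and fallback invented for illustration:

```
<?php
// A monitoring script keeps 'db:good_slaves' up to date; PHP just reads it.
$mc = new Memcached();
$mc->addServer('memcache1.internal', 11211);

$slaves = $mc->get('db:good_slaves');           // e.g. array('10.0.2.11:3306', ...)
if ($slaves === false || count($slaves) == 0) {
    $slaves = array('10.0.2.11:3306');          // safe fallback if the key is gone
}
$read_host = $slaves[array_rand($slaves)];      // pick a healthy slave at random
```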
To top this off, I'd have a workstation or two for network monitoring and report aggregation. I'd use Munin or Zenoss for trend analysis, a syslog server for aggregating the servers' logs, and custom scripts for log analysis and alerting. Nagios may also be used to provide a global overview of the infrastructure and alerting.
Scaling
Upgrading the infrastructure for more load would be handled by adding servers to the Apache/PHP processing pool, adding read slaves to the database tier, and adding load balancer and caching instances up front, with keepalived and the reverse-proxy configurations updated accordingly.