PHP + Apache + MySQL + [anything] server cluster – pitfalls, hints and roadmaps to efficiency

apache-2.2, cloud, cluster, optimization, PHP

The need has arisen for a clustered server setup (is that what it's called?) in my company. We rent our hosting abroad, so we have limited access to the actual hardware, but we have total freedom and are not constrained financially (provided we avoid overkill, of course – no need for 300 servers if 3 can handle things).

We are an international online publisher serving free books that can be read online. This means we have a ton of static content – primarily many, many gigabytes of Flash documents. We recently upgraded the server OS to CentOS x64 and changed the server software from Apache alone to Nginx (for static content) + Apache. There were some problems, however, and we faced some unexpected downtime, which damaged us pretty severely, even though it lasted only a couple of hours.

My thoughts on a cluster setup were as follows:
– server 1: our current MySQL database.
– server 2, server 3, server 4: our Application, that is, our PHP code on Apache
– server 4: static content only (images from 5 KB to 3 MB, PDFs from 5 MB to 100 MB, Flash files from 200 KB to 20 MB, etc.) powered by Cherokee

I believe this setup would help us avoid downtime should one of the three application servers fail, in addition to sharing the load among three servers, unlike now, when everything (static + DB + application) is on one machine.

What I would like from you veterans is some helpful links about server load sharing, plus hints and tips regarding this issue and my proposed setup above. As a PHP developer I have limited experience with Apache, and not much beyond that, so if anyone can offer any valuable insight into their setups or experiences with different hardware/software, I would be much obliged.

Also, what is the correct terminology? Cloud? Cluster? Are there any other terms I should be aware of? Please be gentle, I'm only beginning to tread into the server world.

Thank you

Edit: new plan is as follows, please let me know what you think:

Application Cluster:

  • 3 servers running Nginx (or Cherokee) and Apache with PHP. Nginx would handle requests for static content on the same server (CSS, JS, thumbnails, sprites, images)
  • Since we currently have 2 web sites with rather large traffic (one high on DB updates, the other high on static content serving), we were thinking of putting both on this application server.
  • The two applications would have two load balancers to distribute traffic among the three servers. The servers would be identical clones, and easily scalable later on.
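
To make the load-balancing idea concrete, here is a rough Python sketch of what "distribute traffic among the three servers" could mean in its simplest form: round-robin selection that skips servers marked as down. The class name and the server names (`app1` to `app3`) are made up for illustration, and a real balancer would also handle health checks, sessions and weighting, so treat this as a toy model rather than how any particular product works.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin balancer that skips servers marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)       # fixed rotation order
        self.healthy = set(self.servers)   # servers currently accepting traffic
        self._ring = cycle(self.servers)   # endless iterator over the rotation

    def mark_down(self, server):
        """Take a server out of rotation (e.g. after a failed health check)."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Put a known server back into rotation."""
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        """Return the next healthy server in rotation order."""
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        # At most one full lap is needed to find a healthy server.
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["app1", "app2", "app3"])  # hypothetical names
    print([lb.next_server() for _ in range(4)])
    lb.mark_down("app2")  # simulate one application server failing
    print([lb.next_server() for _ in range(3)])
```

With identical clones behind it, losing one server only shrinks the rotation; the remaining two keep receiving requests.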

Database Cluster

  • Two servers running MySQL, clones of each other, behind a load balancer. Each would act as the other's backup, as it is highly unlikely both would die at the same time. Both applications on the App cluster will use this cluster – one will perform an average read load, the other a high read-write load.

Static Cluster

  • Two servers with static content exclusively, basically just storage for thousands of PDFs, Zips and Flash files. No backup, impossible to perform efficiently. Servers are each other's backup. This static cluster will serve larger static content for both applications on the App cluster.

Is this realistic? What would you advise against, if anything? What would you add?

Best Answer

A few general things that I've learned over the years:

  • See this question for a list of good books on the subject of performance, scaling and high availability sites.
  • "Cluster" is the correct term. You're using multiple machines to serve one site in an attempt to increase availability. You can also use cluster to refer to specific portions of your setup: for example servers 2+3+4 would be your application cluster.
  • Is there any reason why you only have redundancy at the application level? What about MySQL and static content? Especially since your static content is relatively large, look at how much bandwidth you can serve to N concurrent users if needed. What happens if the MySQL server fails or if server #4 has a bad disk?
  • If you're moving everything off one machine, start small unless you don't mind spending more than you need. For example, I found a larger than expected performance gain in a similar situation moving from 1 to 3 servers. After you split into multiple servers you may find the new bottleneck is in a different area.
  • As you plan for scaling now, don't completely forget about possible future scaling. A little forethought and design now can save you time in the future. For example, you have one static server now, but what if you want multiple in a year, or several servers spread out geographically?
  • Consider creating scripts to help set up specific types of servers. Doing it manually each time gets old, and you always forget one step. I did this recently and wish I had done it from the start. Running one script that does 50 install steps automatically in a few minutes saves you a lot of time in the long run.
  • As you get more servers, the likelihood of experiencing some sort of hardware failure becomes higher. Plan for this and play the what-if game: What if the hard drive failed on server X? What would we lose? How long would the site be out? How long would it take to fix it? etc.
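
The bandwidth and what-if questions above can be put into rough numbers. Below is a minimal Python sketch, assuming every user downloads one file in a fixed time and that server failures are independent. The figures in the example (100 concurrent users, 20 MB files, 60 seconds, a 5% annual failure rate, 6 servers) are made-up placeholders, not measurements from the question.

```python
def required_bandwidth_mbps(concurrent_users, avg_file_mb, download_seconds):
    """Megabits per second needed if each user fetches one file of
    avg_file_mb megabytes within download_seconds."""
    megabits_per_file = avg_file_mb * 8  # 1 byte = 8 bits
    return concurrent_users * megabits_per_file / download_seconds


def prob_any_failure(servers, annual_failure_rate):
    """Probability that at least one of `servers` machines fails within a
    year, assuming independent failures at the same annual rate."""
    return 1 - (1 - annual_failure_rate) ** servers


if __name__ == "__main__":
    # Hypothetical load: 100 users each pulling a 20 MB file in one minute.
    print(f"{required_bandwidth_mbps(100, 20, 60):.1f} Mbit/s needed")
    # Hypothetical hardware: compare 1 server with 6 at a 5% yearly failure rate.
    for n in (1, 6):
        print(f"{n} server(s): {prob_any_failure(n, 0.05):.1%} chance of a failure per year")
```

Plugging in your own traffic numbers shows quickly whether the static servers or their uplink become the bottleneck, and the second function illustrates why failures become routine as the server count grows.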