Mysql – need advice on building a scalable architecture for moodle

apache-2.2Architecturemoodle;MySQLscalability

I'm looking into designing an architecture for a moodle based education site, it will serve several thousand users at first but needs to be able to grow to support hundreds of thousands to millions of users across several countries.

I was thinking of a load balancer to distribute requests to several web servers. The web servers can be split by some serving static and some serving dynamic content. Then it is to write to a mysql master node and read from slave nodes.

What kind of load balancer will work well with moodle, should I get a hardware load balancer solution from one of the vendors, or build one myself with open source solution like LVS or reverse proxy?

I was planning to use apache server to serve the web pages at first then as loads become higher, split into lighttpd webserver for static content and apache application server for dynamic content. Stuff like gzip compressing, squid cache, memcache will also be deployed if required.

For the web server hardware, should I use one-u single socket server or a blade solution? Which one will end up being cheaper to run and expand? Supermicro have an interesting product with twin servers in 1u chassis and 4 servers in 2U chassis with infiniband. Has anyone here tried these server before?

For the storage, should I use a SAN or storage server like Sun unified storage 7000 will be sufficient. For a mysql cluster setup should I have two different storage systems, on to use for master node writing access and another for slave to read? Or should all the nodes have separate storage?

Since this website will likely be more heavy on read operations, what consideration should be made for the mysql cluster and storage setup?

For the management part I am planning to use dsh, ganglia, nagios, splunk, kickstart.

For backup I am planning for an LTO tape autoloader. This site will be primarily used for the Asia region, so there will be several hours of low traffic at night. What is the best way to back up up a mysql cluster? Can I temporarily disable write and take out the master to perform the backup?

Please advise if you have experience with setting up this kind of scalable web site, most of my experience has been in working with large unix boxes, or smaller standalone unix/linux boxes. So this kind of scale out implementation is a first time for me.

Thanks

Robert.

Best Answer

Robert, you're clearly a smart guy, but respectfully, get a consultant with prior domain knowledge, or begin building something small now and see where it takes you. There is no way to answer your post; it has too many abstract concepts and no hard numbers.

A few thoughts:

will serve several thousand users at first ... grow to support hundreds of thousands to millions of users

Prove that you need that level of scale first. Don't build a scale-out architecture in anticipation of users that never show up. Sorry if I sound harsh, but 99% of all websites don't grow to the large end of the scale. See Stack Overflow / Server Fault; they're serving a million users monthly from a handful of fairly conventional servers.

should I get a hardware load balancer solution from one of the vendors, or build one myself with open source solution

Depends on your skills and your situation regarding time vs money. Once built, the open source and commercial offerings work pretty much exactly the same. Commercial solutions tend to have better statistics and nicer management interfaces out of the box.

For the web server hardware, should I use one-u single socket server or a blade solution?

Ask your server vendor for prices. Ask your datacenter about power density, i.e. their preferred balance between size and power consumption -- often you'll be power limited, so a dense solution like blades may not win you anything.

For the storage, should I use a SAN or storage server like Sun unified storage 7000 will be sufficient.

Get SAN when you have a proven need for SAN; then you will also better understand what needs your SAN should solve for you.

Since this website will likely be more heavy on read operations, what consideration should be made for the mysql cluster and storage setup?

Create a really good caching solution. Either full page caching like Squid (Varnish), or application data caching like Memcached, or a combination of both. Consider cache invalidation, could you need to quickly purge content from your caches to avoid it being served again?

What is the best way to back up up a mysql cluster?

Opinions vary, but one common approach is to have a dedicated slave MySQL just for backups, and use something like InnoBackup or Maatkit for a self-scripted backup solution.

Edit: If you're really going to build this from scratch now, then please take a good look at cloud computing before committing. Cloud computing isn't just about scalability, even if scalability is a great strenght. Certain services that come as part of the package can really help in making day to day operations easier. Some examples:

  • Live snapshots of Amazon EBS volumes make for easy database backups.
  • Amazon has load balancing as a set and forget service (of course more feature limited than good self-hosted load balancer, but easy to get started with).
  • Rightscale has extensive server monitoring built into their images, which makes for easy capacity planning / application introspection.