There are many methods for scaling web applications and the supporting infrastructure. Cal Henderson wrote a good book on the subject called "Building Scalable Web Sites". It was based on his experiences with Flickr. Unless you grow slowly you will run into the same kinds of growth issues many others have seen. Scaling is, like many other subjects, a journey not a destination.
The first steps are to make everything repeatable, measurable, and manageable. By repeatable I mean use tools like FAI or Kickstart to install the OS, and something like Puppet or CFEngine to configure the machines once a base OS is installed. By measurable I mean use something like Cacti, Cricket, or Ganglia to monitor how your cluster is performing now. Measure not just things like load average, but how long it takes to render a page or service a request (a sketch of the latter follows below). Neither of these seems critical when you are starting out, but together they should warn you before your system falls over from load, and they make it simple to add ten or a hundred machines at once. Base your growth plans on data, not guesses.
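For the request-timing part, here is a minimal sketch in Python, assuming a WSGI application; hello_app and the log destination are hypothetical stand-ins, and in practice you would feed the numbers into Cacti or Ganglia rather than a log:

```
# Minimal sketch: wrap a WSGI app so every request's service time is logged.
# The resulting numbers can be graphed alongside load average.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("request-timing")

def timing_middleware(app):
    """Wrap a WSGI callable and log how long each request takes."""
    def wrapped(environ, start_response):
        start = time.time()
        try:
            return app(environ, start_response)
        finally:
            elapsed_ms = (time.time() - start) * 1000.0
            log.info("%s %s took %.1f ms",
                     environ.get("REQUEST_METHOD", "-"),
                     environ.get("PATH_INFO", "-"),
                     elapsed_ms)
    return wrapped

# Hypothetical application, for illustration only.
def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]

application = timing_middleware(hello_app)

if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    make_server("127.0.0.1", 8000, application).serve_forever()
```

The same wrapper works unchanged no matter how many web servers you run it on, which is the point: per-request timing is something you want from day one, not after the first meltdown.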
Manageable means putting tools in place that allow you to automatically generate and test as much of your configuration as you can. Start with what you have and grow it. If you are storing machine info in a database, great; if not, you probably have a spreadsheet you can export. Put your configuration in some sort of source control if you haven't already. Automatically creating configurations from your database allows you to grow with less stress, and testing them before they go live can save you from a service failing to start because of a typo or other error (see the sketch below).
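As a rough illustration of generating and testing configuration from exported machine data, here is a sketch; the machines.csv file, its columns, and the pool format are assumptions for the example, not anything specific to your setup:

```
# Minimal sketch, assuming machine info has been exported to machines.csv
# with "hostname,ip,role" columns (both the file name and the columns are
# hypothetical).  Generates a simple server-pool snippet from the export
# and sanity-checks it before it is committed and pushed anywhere.
import csv
import ipaddress
import sys

def load_machines(path):
    with open(path, newline="") as fh:
        return list(csv.DictReader(fh))

def render_pool(machines, role="web"):
    lines = ["# generated -- do not edit by hand"]
    for m in machines:
        if m["role"] != role:
            continue
        lines.append("server %s:80;  # %s" % (m["ip"], m["hostname"]))
    return "\n".join(lines) + "\n"

def validate(machines):
    """Fail fast on typos before the config ever reaches a server."""
    errors = []
    seen = set()
    for m in machines:
        try:
            ipaddress.ip_address(m["ip"])
        except ValueError:
            errors.append("bad IP for %s: %r" % (m["hostname"], m["ip"]))
        if m["hostname"] in seen:
            errors.append("duplicate hostname: %s" % m["hostname"])
        seen.add(m["hostname"])
    return errors

if __name__ == "__main__":
    machines = load_machines("machines.csv")
    problems = validate(machines)
    if problems:
        sys.exit("refusing to generate config:\n" + "\n".join(problems))
    print(render_pool(machines))
```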
Horizontal methods assume that you can repeat things appropriately. Think about your application. What areas make sense to split up? What areas can be handled by many machines in parallel? Does latency affect your application? How likely are you to run into connection limits or other bottlenecks? Are you asking your web servers to also handle mail delivery, database queries, or other chores?
I've worked in environments with hundreds of web servers. Things should be split differently for different types of load. If you have large collections of data files that rarely change, partitioning them away from the actively changing data may give you more headroom for serving both static and dynamic content. Different tools work better for different loads: Apache and Lighttpd work well for some things, Nginx works better for others.
Look at proxies and caches, both between your users and the application and between parts of the application itself. I read that you are already using memcached; that helps. Putting a reverse proxy like Perlbal or Pound between your load balancer and the web servers may make sense depending on your application traffic.
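Since memcached was mentioned, here is a minimal cache-aside sketch in Python; it assumes the pymemcache client, and load_profile_from_db() is a hypothetical placeholder for a real database query:

```
# Minimal cache-aside sketch in front of the database; a reverse proxy
# plays a similar role in front of the web tier.
from pymemcache.client.base import Client

cache = Client(("127.0.0.1", 11211))
CACHE_TTL = 300  # seconds; tune to how stale the data is allowed to be

def load_profile_from_db(user_id):
    # Placeholder for the real database query.
    return "profile-for-%s" % user_id

def get_profile(user_id):
    key = "profile:%s" % user_id
    value = cache.get(key)
    if value is not None:
        return value.decode()              # cache hit: no database round trip
    value = load_profile_from_db(user_id)
    cache.set(key, value.encode(), expire=CACHE_TTL)
    return value

if __name__ == "__main__":
    print(get_profile(42))
    print(get_profile(42))  # second call should be served from memcached
```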
At some point you may discover that MySQL master <-> (N * slave) replication isn't keeping up and that you need to partition the databases. Partitioning usually means adding another layer of data management: many people use a separate lookup database, fronted by memcached, to track where each piece of data lives. At one place I worked we used master <-> master replicated pairs for most data, and another pair with 10 read slaves that held the pointers to the data. A sketch of that lookup pattern follows below.
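Here is a rough sketch of that pointer / directory pattern; the shard map, the directory query, and the TTL are hypothetical and only meant to show the shape of the lookup:

```
# Minimal sketch of the "pointer" layer described above: a small directory
# database records which partition holds each user's data, and memcached
# keeps the hot lookups off the directory database.
from pymemcache.client.base import Client

cache = Client(("127.0.0.1", 11211))

SHARDS = {
    0: "dbname=app host=shard0-master",
    1: "dbname=app host=shard1-master",
}

def lookup_shard_in_directory_db(user_id):
    # Placeholder for a query against the directory database
    # (the replicated pair with read slaves mentioned above).
    return user_id % len(SHARDS)

def shard_for_user(user_id):
    key = "shard:%d" % user_id
    cached = cache.get(key)
    if cached is not None:
        return SHARDS[int(cached)]
    shard_id = lookup_shard_in_directory_db(user_id)
    cache.set(key, str(shard_id).encode(), expire=3600)
    return SHARDS[shard_id]

if __name__ == "__main__":
    print(shard_for_user(12345))   # first call hits the directory database
    print(shard_for_user(12345))   # second call is answered from memcached
```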
This is just a bare-bones description of some of the issues I've run into while working at sites with hundreds of machines. There is no end to the things that crop up when growing from a few machines to a few hundred, and I'm sure the same holds true for growth into the thousands.
Best Answer
What you're looking for, re: justification for a three-firewall architecture, sounds like a bit of a fantasy world that isn't going to map well onto reality. Unless you control all the applications, the harsh reality is that most application vendors assume unfiltered and unfettered access between the software components of each tier and the adjacent tier (and, possibly, the non-adjacent tiers, too).
I've done some work in environments where management-mandated "security" meant firewalling server computers off from the LAN and minimizing the number of exposed services. It was a challenge every time new software, hardware, or a new vendor became involved, because all the "traditional" assumptions of unfettered end-to-end connectivity within the LAN were turned on their ear. Implementing anything ended up costing more in such an environment.
My strategy and recommendation for limiting communication and exposure within a LAN is as follows:
Use access-control lists / firewall rules on internal routers and firewalls to "paint with a broad brush" and exclude types of traffic that are clearly undesirable (for example, blocking access to the subnet / VLAN where the IP security cameras live from anywhere except the VLAN where the video aggregation servers are installed, or blocking Internet access from a subnet that contains only internal-facing server computers).
Enforce more specific access-control rules with firewall software running on the server computers themselves (Windows Firewall, iptables); a minimal sketch of building such host-level rules follows below. Ensure that servers have only the required software installed and running, and that only the desired services / daemons are listening for network traffic, on only the desired interfaces. Common-sense approaches to change control, password / SSO security, and keeping operating systems and applications updated rule the day here.
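As a sketch of the host-level half of this (the iptables case), the following Python script builds a rule list from a short allow-list and prints the commands for review rather than applying them; the ports and source subnets are made-up examples:

```
# Minimal sketch of generating host-level iptables rules from a short
# allow-list.  It prints the commands instead of running them, so the
# output can be reviewed (or fed through change control) before being
# applied on a server.
ALLOWED = [
    # (protocol, port, allowed source subnet) -- hypothetical examples
    ("tcp", 22,  "10.0.0.0/24"),    # ssh from the admin subnet only
    ("tcp", 80,  "10.0.1.0/24"),    # http from the load balancer subnet
    ("tcp", 443, "10.0.1.0/24"),
]

def iptables_rules(allowed):
    rules = [
        "iptables -F INPUT",
        "iptables -P INPUT DROP",                       # default deny
        "iptables -A INPUT -i lo -j ACCEPT",            # local traffic
        "iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT",
    ]
    for proto, port, source in allowed:
        rules.append(
            "iptables -A INPUT -p %s --dport %d -s %s -j ACCEPT"
            % (proto, port, source)
        )
    return rules

if __name__ == "__main__":
    for rule in iptables_rules(ALLOWED):
        print(rule)
```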
Firewalls allow you to quantify and arbitrate traffic flows. So-called "layer 7" firewalls stick their noses into the application-layer traffic (and even then, only to some arbitrary depth) and can enforce more specialized arbitration rules than "traditional" firewalls. Firewalls do not "provide security", though; they are only as effective as the humans designing the rule sets or monitoring the logs. Invariably, the more tightly constrained the rules are initially, the more compromises end up being made to get the applications working.
I'd personally be dubious of an effort to add firewalls just to "add security". I see increased maintenance cost for every application on the network, without any guarantee of a quantifiable improvement in the environment's resistance to attack or a reduced risk profile.