Running 100 virtual machines on a single VMware host server

host scalability virtual-machines virtualization vmware-esx

I've been using VMware for many years, running dozens of production servers with very few issues. But I have never tried hosting more than 20 VMs on a single physical host.
Here is the idea:

  1. A stripped down version of Windows XP can live with 512MB of RAM and 4GB disk space.
  2. $5,000 gets me an 8-core server class machine with 64GB of RAM and four SAS mirrors.
  3. Since 100 of the above-mentioned VMs fit into this server (about 50GB of RAM and 400GB of disk in total), my hardware cost is only $50 per VM, which is super nice (cheaper than renting VMs at GoDaddy or any other hosting shop).
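The arithmetic behind the list above can be checked with a short sketch. The function name `capacity_plan` and its parameters are made up for illustration, and the calculation ignores hypervisor overhead and host disk capacity, which the question does not specify.

```python
def capacity_plan(vm_count, ram_per_vm_mb, disk_per_vm_gb, host_ram_gb, host_cost_usd):
    """Rough sizing for identical VMs on one host (hypothetical helper).

    Ignores hypervisor memory overhead, so the real headroom is smaller.
    """
    total_ram_gb = vm_count * ram_per_vm_mb / 1024
    total_disk_gb = vm_count * disk_per_vm_gb
    return {
        "total_ram_gb": total_ram_gb,
        "total_disk_gb": total_disk_gb,
        "ram_fits": total_ram_gb <= host_ram_gb,
        "cost_per_vm_usd": host_cost_usd / vm_count,
    }


# Figures from the question: 100 VMs, 512MB RAM and 4GB disk each,
# on a $5,000 host with 64GB of RAM.
plan = capacity_plan(100, 512, 4, 64, 5000)
print(plan)
```

On paper the VMs need 50GB of the 64GB of RAM and 400GB of disk, which gives the $50 per VM figure. It says nothing about I/O, which capacity numbers alone do not capture.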

Has anybody been able to achieve this kind of scalability with VMware? I've done a few tests and bumped into a weird issue: VM performance starts degrading dramatically once about 20 VMs are running. At the same time, the host server does not show any resource bottlenecks (the disks are 99% idle, CPU utilization is under 15% and there is plenty of free RAM).

I'd appreciate it if you could share your success stories around scaling VMware or any other virtualization technology!

Best Answer

Yes, you can. For some Windows 2003 workloads even as little as 384MiB suffices, so 512MiB is a pretty good estimate, albeit a little high. RAM should not be a problem, and neither should CPU.

100 VMs is a bit steep, but it is doable, especially if the VMs are not going to be very busy. We easily run 60 servers (Windows 2003 and RHEL) on a single ESX server.

Assuming you are talking about VMware ESX, you should also know that it is able to overcommit memory. VMs hardly ever use their full memory allocation, so ESX can commit more RAM to VMs than is physically available and run more VMs than it 'officially' has RAM for.
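To make the overcommit idea concrete, here is a minimal sketch that compares the RAM configured for all VMs against the host's physical RAM. The helper `overcommit_ratio` is hypothetical and is not part of any VMware tool; the 160-VM case is an invented example, not a tested configuration.

```python
def overcommit_ratio(vm_count, ram_per_vm_mb, host_ram_gb):
    """Configured VM memory divided by physical host memory (hypothetical helper).

    A result above 1.0 means more RAM is promised to VMs than the host has.
    """
    configured_gb = vm_count * ram_per_vm_mb / 1024
    return configured_gb / host_ram_gb


# The question's setup: 100 VMs x 512MB on a 64GB host is not overcommitted.
print(overcommit_ratio(100, 512, 64))

# An invented example: 160 such VMs would promise 80GB on a 64GB host.
print(overcommit_ratio(160, 512, 64))
```

A ratio below 1.0, as in the 100-VM case, means the plan does not rely on overcommit at all. A ratio above 1.0 only works as long as the VMs really do leave part of their allocation unused.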

Most likely your bottleneck will not be CPU or RAM, but I/O. VMware boasts huge IOPS figures in its marketing, but when push comes to shove, SCSI reservation conflicts and limited bandwidth will stop you dead well before you come close to the IOPS VMware brags about.

Anyway, we are not seeing the performance degradation at 20 VMs that you describe. What version of ESX are you using?