How to properly shut down an ESXi cluster after a power outage

ups vmware-esxi

I am integrating an Eaton UPS with our ESXi 4.1 cluster. Eaton provides a shutdown script for a single ESXi host (shutdownESXi.pl). The problem: when I ran shutdownESXi.pl manually on the vMA with

  shutdownESXi.pl --server ServerName --username .. --password ..

the host shut down, but its VMs were migrated to other hosts (HA is enabled). But what would happen if all the hosts went through the shutdown process? I am afraid the VMs would keep migrating between the remaining ESXi hosts and the cluster would never shut down properly.

1) Is there a best practice for shutting down the whole cluster with a script on the vMA? (Disable HA first? Maintenance mode?)
2) If anyone has integrated the Eaton shutdown script, is there a way to hide the root password within Intelligent Power Protector? It seems quite dumb to store it in plaintext somewhere on the vMA.

Best Answer

Shutting down or rebooting an ESXi host falls under the definition of 'maintenance' by my reckoning. I'd argue that any host being shut down or rebooted should be in maintenance mode first. I seem to remember that the vCenter console prompts you if you try to shut down or reboot a host that isn't in maintenance mode. A script that shuts down a host should put it into maintenance mode first.

Given that a host can't enter maintenance mode until all VMs on it are powered off, suspended, or migrated away, a UPS shutting down one particular host is a different type of event from shutting down the entire cluster. If a single host is going down, you probably want its VMs migrated onto other hosts. However, if the whole cluster is going down, the script needs to first disable HA on the cluster, then suspend or shut down the VMs, then put the hosts into maintenance mode before shutting them down.
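To make the ordering for the whole-cluster case concrete, here is a minimal sketch that only builds the list of steps in the order described above. It makes no vSphere calls. The function name `plan_cluster_shutdown`, the action labels, and the host and VM names are all hypothetical, chosen for illustration; a real script would map each step onto whatever the SDK or the Eaton script actually provides.

```python
from typing import Dict, List, Tuple

# (action, target) pairs; the action labels are illustrative, not real API names.
Step = Tuple[str, str]


def plan_cluster_shutdown(cluster: str, vms_by_host: Dict[str, List[str]]) -> List[Step]:
    """Return the steps for a full-cluster shutdown in a safe order."""
    hosts = sorted(vms_by_host)

    # 1. Disable HA first so it does not try to restart VMs on other hosts.
    steps: List[Step] = [("disable_ha", cluster)]

    # 2. Shut down (or suspend) every VM on every host.
    for host in hosts:
        for vm in vms_by_host[host]:
            steps.append(("shutdown_vm", vm))

    # 3. With no running VMs left, each host can enter maintenance mode.
    for host in hosts:
        steps.append(("enter_maintenance_mode", host))

    # 4. Only then shut the hosts down.
    for host in hosts:
        steps.append(("shutdown_host", host))

    return steps


if __name__ == "__main__":
    # Hypothetical inventory for demonstration.
    inventory = {"esx01": ["web01", "db01"], "esx02": ["app01"]}
    for action, target in plan_cluster_shutdown("prod-cluster", inventory):
        print(f"{action:<24}{target}")
```

The point of separating the plan from its execution is that the order can be checked without touching a live cluster: HA is disabled before any VM goes down, and no host is shut down until every host is already in maintenance mode.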

It's not clear to me which of these two actions you want this script to take. If it's the latter (I'm guessing it is, because you're looking at a complete power outage scenario), you'll probably need to modify the script to perform the necessary steps before it shuts the host down. Looking at the SDK documentation (http://www.vmware.com/pdf/ProgrammingGuide201.pdf), you should be able to do this within the Perl script.
