Monitor DELL hardware on VMware ESXi 5.5 server

Tags: dell, network-monitoring, snmp, vmware-esxi

Despite researching this topic quite a bit online (to be fair, I'm not a full-time sysadmin), I'm unable to figure this out.

We have a bunch of VMware ESXi 5.5 servers, some of which are integrated into vSphere and some of which are not (for cost reasons).

All of them run the standard ESXi image, with the exception of one machine which is running the DELL VMware ESXi image.

What I would like to accomplish seems simple: Configure the system so that it can be queried via SNMP from a remote host, whether it's snmpwalk, Nagios, PRTG etc. I'd like to see information from temperature sensors, installed disks and their status, fan speed, PSU status etc.

I was under the impression that installing the VMware version from DELL would automagically enable the necessary modules (OpenManage most importantly), but it seems that is not the case.

I have conflicting information on whether this is even possible at all. Some documents say that you cannot query a DELL VMware ESXi server via SNMP and need to use a CIM client instead. Then there are the OMSA VIBs one can install, etc.

I imagine this being a fairly common requirement, yet the docs available pull one in all different directions.

Is what I am trying to do even possible without a complete vSphere environment?

Best Answer

Yes, you can monitor a standalone ESXi Host using any SNMP monitoring software, but some items may only be visible to a monitoring tool that supports the CIM protocol.
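For the SNMP side, here is a rough sketch of turning on the built-in ESXi SNMP agent from the Host's CLI. The function names, the DRY_RUN switch and the community string monitoring_ro are made up for illustration. Check the esxcli options against your own build before relying on them.

```shell
#!/bin/sh
# Sketch only: enable the built-in SNMP agent on an ESXi 5.5 Host.
# "monitoring_ro" (see usage below) is a hypothetical community string.

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

enable_esxi_snmp() {
    community="$1"
    # Set the read community and switch the agent on.
    run esxcli system snmp set --communities "$community"
    run esxcli system snmp set --enable true
    # Open the Host firewall for SNMP (here: from any source address).
    run esxcli network firewall ruleset set --ruleset-id snmp --allowed-all true
    run esxcli network firewall ruleset set --ruleset-id snmp --enabled true
    # Show the resulting agent configuration.
    run esxcli system snmp get
}
```

Running DRY_RUN=1 enable_esxi_snmp monitoring_ro only prints the commands, which is handy for review before touching a production Host. Once the agent is up, something like snmpwalk -v 2c -c monitoring_ro <host address> from the monitoring machine should return data. What it exposes depends on the hardware and the installed providers.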

All of my ESXi Hosts are part of vCenter, but we monitor them directly (using the VMkernel Host IP address) with SolarWinds NPM. There are 5 or 6 CIM modules built into ESXi 5.5 that give you hardware health, but RAID card health is not one of them. You will need to add the Dell OMSA VIB, which adds the additional CIM agents including the one for the RAID array. Brian Atkinson's post is still the best description of the process I have found:

https://communities.vmware.com/people/vmroyale/blog/2012/07/26/how-to-use-dell-dset-with-esxi

If you are going to use a third-party monitoring tool that keeps historical information and does alerting, the only part of those instructions you need is the one for installing the OMSA ESXi VIB. If you wish to use the Dell OMSA Server, you can install it remotely on a bare-bones server, remotely in a VM, or locally as a VM.
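As a rough sketch of the VIB install step: the function name and any bundle path you pass in are hypothetical, and I am assuming the OMSA VIB comes as an offline bundle (.zip) that you have already copied to a datastore. Entering maintenance mode first is also an assumption on my part, so check Dell's instructions for your version.

```shell
#!/bin/sh
# Sketch only: install an offline bundle (such as the Dell OMSA VIB) on an ESXi Host.

install_offline_bundle() {
    bundle="$1"
    # esxcli expects the full path to the bundle, so reject relative paths.
    case "$bundle" in
        /*) ;;
        *) echo "bundle path must be absolute: $bundle" >&2; return 2 ;;
    esac
    # Fail early if the file is not there.
    if [ ! -f "$bundle" ]; then
        echo "no such file: $bundle" >&2
        return 1
    fi
    # Enter maintenance mode, then install. Reboot the Host afterwards.
    esxcli system maintenanceMode set --enable true &&
    esxcli software vib install -d "$bundle"
}
```

You would call it as install_offline_bundle followed by the full /vmfs/volumes/... path of the bundle, then reboot the Host and take it out of maintenance mode.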

You can use the OMSA Server to connect to DRAC and iDRAC Out of Band (OOB/IPMI/iLO) management cards, or to the ESXi Host after you install the OMSA VIB on it. You will not see the RAID health information through the DRAC or iDRAC, though, only when connecting the OMSA Server to an ESXi Host. I repeat the word "Server" so there is no confusion: the OMSA Server acts as a client to the OMSA VIB that is installed on the ESXi Host.

Some useful resources:

Show the current CIM providers on an ESXi Host: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2053715

Show the currently installed VIBs on the ESXi Host, from the Host's CLI: esxcli software vib list
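To pick the Dell entries out of that list, a small filter along these lines may help. It assumes the usual column layout (Name, Version, Vendor, ...) and that the OMSA VIB reports its vendor as "Dell", so verify both against your own output. The function name dell_vibs is made up.

```shell
#!/bin/sh
# Sketch only: read "esxcli software vib list" output on stdin and print
# name and version of every VIB whose Vendor column (third field) is Dell.
dell_vibs() {
    awk 'tolower($3) == "dell" { print $1, $2 }'
}

# On the Host:
#   esxcli software vib list | dell_vibs
```

If this prints nothing after you have installed the OMSA VIB, look at the full list before concluding the install failed, since the vendor string may differ on your build.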

You do see some minor additional hardware health details when you connect to a vCenter server rather than to the ESXi Host directly. Generally, though, if you do not see the hardware health you are looking for in the Configuration / Health Status panel, you are missing a CIM provider and need to locate and install its VIB on the ESXi Host. When you add the Dell OMSA VIB to the ESXi Host you will see a Storage sensor added to the Health Status page, showing the RAID volume, drive, controller and battery health for your storage controller. You may need to reset the sensors for it to show up, and the first time after the VIB install and Host reboot it can take 15 to 20 minutes to appear.

If you do not see a sensor on the ESXi Host's Health Status page when you connect with the vSphere Client then you are most likely not going to see it when you are remotely polling the sensors with monitoring software.

Also note that not all servers have the same sensors, and you may not be able to get the same health status from all of them; it depends on the server hardware, the RAID card and the version of the CIM provider available for that combination. You may also need to upgrade or change the VIBs for the RAID card in order for the health status to work.

The CIM provider (the OMSA VIB in this case) talks to the hardware through the device VIB (the real device driver) and passes this information to the CIM Broker on the ESXi Host, also known as the Small Footprint CIM Broker Daemon (sfcbd). When you poll the ESXi Host for hardware health using robust monitoring software, it will get some information using SNMP queries, some using CIM and some using the ESXi API (which are SOAP requests). The CIM client talks to the sfcbd process on the ESXi Host.

Sometimes the CIM process just stops working. When that happens you will need to restart the sfcbd-watchdog process on the ESXi Host. This restarts the sfcbd service, and CIM polling will work again. From the CLI of the Host: /etc/init.d/sfcbd-watchdog restart
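If you want to script that, here is a minimal sketch. It assumes the init script supports a status argument whose output contains "is running" when the broker is up. That string match is my assumption, so confirm it on your Host. The function name and the SFCBD_INIT override are made up for illustration.

```shell
#!/bin/sh
# Sketch only: restart the CIM broker if its watchdog reports it as down.
# SFCBD_INIT can be overridden, for example to point at a stub when testing.
SFCBD_INIT="${SFCBD_INIT:-/etc/init.d/sfcbd-watchdog}"

ensure_sfcbd_running() {
    # Assumption: "status" prints a line containing "is running" when sfcbd is up.
    if "$SFCBD_INIT" status 2>/dev/null | grep -q "is running"; then
        echo "sfcbd already running"
    else
        echo "sfcbd not running, restarting"
        "$SFCBD_INIT" restart
    fi
}
```

This only catches the case where the broker has exited. If status reports it as running but CIM polling still fails, run the restart by hand as described above.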

I think that covers most of the items you need to get you running.