Could someone help me solve the following problem?
I have ESXi 4.0 Enterprise edition (an old one, but that cannot be changed).
I want to monitor ESXi from a Nagios server, and I use the check_esx3-0.5.pl plugin http://exchange.nagios.org/components/com_mtree/attachment.php?link_id=2154&cf_id=29 from the Nagios Exchange page.
The plugin does work when I run it by hand on the Nagios server. Here is the command I run:
./check_esx3-0.5.pl -H 172.32.3.3 -u user -p password -l vmfs
And the result is:
CHECK_ESX3-0.5.PL OK - Storages : iSCSI ibm storage=219492.00 MB (23.02%), datastore esxi04=60704.00 MB (93.36%), Storage Backup=853604.69 MB (60.13%) | 'iSCSI ibm storage'=219492.00MB;; 'datastore esxi04'=60704.00MB;; 'Storage Backup'=853604.69MB;;
The output above is completely correct.
Here are my settings (service definitions):
define service{
use local-service ; Name of service template to use
host_name esxi03.troxo.net
service_description PING
check_command check_ping!100.0,20%!500.0,60%
contact_groups admins
}
# VMware: check CPU
define service{
use local-service
host_name esxi03.troxo.net
service_description ESXi CPU Load
check_command check_esx_cpu!80!90
}
# Define a service to check the memory usage on the remote machine.
# Warning if > 80%, critical if > 90%.
# check memory usage
define service{
use local-service
host_name esxi03.troxo.net
service_description ESXi Memory usage
check_command check_esx_mem!80!90
}
# Define a service to check runtime status on the remote machine.
# check runtime status
define service{
use local-service
host_name esxi03.troxo.net
service_description ESXi Runtime status
check_command check_esx_runtime
}
# check io read
define service{
use local-service
host_name esxi03.troxo.net
service_description ESXi IO read
check_command check_esx_ioread!40!90
}
# check io write
define service{
use local-service
host_name esxi03.troxo.net
service_description ESXi IO write
check_command check_esx_iowrite!40!90
}
# Define a service to check VMFS free space on the remote machine.
# check vmfs
define service{
use local-service
host_name esxi03.troxo.net
service_description ESXi VMFS Free Space
check_command check_vmfs
}
I added the user and password for logging in to ESXi to resources.cfg. The commands.cfg file looks like this:
# check io write
define command{
command_name check_esx_iowrite
command_line $USER1$/check_esx3-0.5.pl -D $HOSTADDRESS$ -u $USER3$ -p $USER4$ -l io -s write -w $ARG1$ -c $ARG2$
}
# check vmfs
define command{
command_name check_vmfs
command_line $USER1$/check_esx3-0.5.pl -H $HOSTADDRESS$ -u $USER3$ -p $USER4$ -l vmfs -w $ARG1$ -c $ARG2$
}
Only the ping check is OK. All the others are reported as UNKNOWN. Here is what I see on the Nagios page:
ESXi VMFS Free Space
UNKNOWN 03-08-2012 13:56:34 ..... Usage: check_esx.pl -D <data_center>
$USER3$ and $USER4$ are the login and password for ESXi, defined in resources.cfg.
I do not have vCenter Server installed; it is not necessary for 6 ESXi servers.
So, could someone help me solve this issue?
Best Answer
Your command definition expects ARG1 and ARG2 (for -w and -c, respectively), but you are not passing them after check_vmfs in your service definition's check_command line.
So the resulting command being executed by Nagios is something like "/path/to/check_esx3-0.5.pl -H 172.32.3.3 -u user -p passwd -l vmfs -w -c", which is why it's returning "error: here's the usage help" instead of what you expect.
You should either remove the -w and -c pieces from the command definition, or add some (I'm assuming) thresholds to your service definition. E.g., "check_vmfs!10!20" or similar.
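As a rough sketch of the second option, the service definition could pass two thresholds after the command name. The values 10 and 20 are just the placeholders from the example above, not recommendations; I am assuming nothing about their unit or direction, so check the plugin's --help output to see what -w and -c expect for the vmfs check.

```
# Sketch only: 10 and 20 are placeholder thresholds, not tuned values.
define service{
use local-service
host_name esxi03.troxo.net
service_description ESXi VMFS Free Space
check_command check_vmfs!10!20 ; $ARG1$=10 (-w), $ARG2$=20 (-c)
}
```

With that in place, Nagios fills in $ARG1$ and $ARG2$ in the check_vmfs command definition, so the plugin is called with "-w 10 -c 20" instead of a bare "-w -c".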