Bash – Problems in monitoring file age with nagios (command substitution in filename)

bash nagios

In my current Nagios installation I check for the existence of a remote backup through NRPE. In particular, the relevant line of my remote nrpe.cfg (on an Ubuntu 8.x host) is:

command[check_zimbra_backup]=/usr/lib/nagios/plugins/check_file_age -f \
/backupdir/zimbra_backup-$(date +%a).tar.gz -w 518400 -c 86400

Running the command locally returned OK:

$ sudo su -m nagios -c "/usr/lib/nagios/plugins/check_file_age -f \
/backupdir/zimbra_backup-$(date +%a).tar.gz -w 518400 -c 86400 "
FILE_AGE OK: /backupdir/zimbra_backup-Sun.tar.gz is 47661 seconds old and 10863637475 bytes

However, my logs showed CRITICAL:

nagios: SERVICE NOTIFICATION: zimbra backups;CRITICAL;notify-service-by-email;
FILE_AGE CRITICAL: /backupdir/zimbra_backup-Sun.tar.gz is 22373 seconds old and 10863637475 bytes 

Notice how it returns a critical state despite the fact that the number of seconds reported (22373) is smaller than both the critical threshold (86400 s, or 24 hours) and the warning threshold (518400 s, or 6 days).
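To make the expected result explicit, here is a minimal sketch of an age-threshold comparison. It assumes the plugin reports CRITICAL when the age exceeds the -c value and WARNING when it exceeds the -w value; `classify_age` is a hypothetical helper written for illustration, not part of check_file_age.

```shell
#!/bin/sh
# Simplified sketch of an age-threshold check (assumed logic, not the plugin's source).
# Usage: classify_age AGE_SECONDS WARN_SECONDS CRIT_SECONDS
classify_age() {
    age=$1
    warn=$2
    crit=$3
    if [ "$age" -gt "$crit" ]; then
        echo CRITICAL
    elif [ "$age" -gt "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Values from the question: age 22373 s, -w 518400, -c 86400
classify_age 22373 518400 86400
```

Under that assumption, a file that really is 22373 seconds old should be reported as OK, which suggests the arguments reaching the plugin through NRPE are not the ones written in nrpe.cfg.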

The interesting bit is that running the NRPE plugin remotely returns something strange:

$ sudo su -m _nagios -c "/usr/local/libexec/nagios/check_nrpe -H HOST \
-c check_zimbra_backup"
FILE_AGE CRITICAL: /backupdir/zimbra_backup-Sun.tar.gz is 23611 seconds old and 10863637475 bytes
ҷ?Oڷ`xڷ

Notice the last line, which looks like some sort of garbled output.

The check_file_age plugin is version v1750 (nagios-plugins 1.4.11).

Best Answer

The solution was to change the command substitution from

[...]zimbra_backup-$(date +%a).tar.gz 

to

[...]zimbra_backup-`date +%a`.tar.gz 

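It looks like the $ sign is the culprit. Nagios and NRPE use $ to delimit macros (for example $ARG1$), so a $(...) in the command definition is probably being caught by macro processing before the shell ever sees it. Backticks perform the same substitution without using a $.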
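In a plain shell the two substitution forms expand to the same string, which is why a local test does not reveal the problem. A minimal sketch to confirm the equivalence outside NRPE, using the backup path from the question (the variable names are only illustrative):

```shell
#!/bin/sh
# Both command-substitution styles produce the same file name in a plain shell.
# The path is the one from the question; no file needs to exist for this check.
dollar_form="/backupdir/zimbra_backup-$(date +%a).tar.gz"
backtick_form="/backupdir/zimbra_backup-`date +%a`.tar.gz"

echo "$dollar_form"
echo "$backtick_form"
```

Both lines should print the same path, so any difference in behaviour comes from what happens to the command line before the shell runs it.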
