What is the correct way to kill Nagios's ndo2db daemon?
When I shut down nagios and ndo2db, I do the following:
/etc/init.d/nagios stop
/etc/init.d/ndo2db stop
and I see the following in nagios.log:
[1311865619] Caught SIGTERM, shutting down...
[1311865619] Successfully shutdown... (PID=12422)
[1311865619] ndomod: Shutdown complete.
[1311865619] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
However, /etc/init.d/ndo2db stop outputs the following message to the console:
ndo2db was not running… could not stop
If I do ps -ax | grep nagios, I still see an ndo2db process running:
12381 ? Ss 0:00 /usr/local/nagios//bin/ndo2db -c /usr/local/nagios//etc/ndo2db.cfg
I then have to kill this process manually before restarting ndo2db, otherwise I get:
[root@nag01 nagios]# /etc/init.d/ndo2db start
Starting ndo2db:Could not bind socket: Address already in use
done.
[root@nag01 nagios]#
Is there a cleaner way of doing this?
I'm running:
- Nagios 3.2.3 built from source
- NDO Utils 1.4b9 built from source
- Centreon 2.2.1 Stable
- Centos 5.5 x64
- MySQL 5.5 x64
Update:
One of the odd(?) things I've noticed is that when ndo2db and nagios are running, I see two instances of ndo2db:
12753 ? Ss 0:00 /usr/local/nagios//bin/ndo2db -c /usr/local/nagios//etc/ndo2db.cfg
12792 ? S 0:00 /usr/local/nagios//bin/ndo2db -c /usr/local/nagios//etc/ndo2db.cfg
Is this normal? If so, then my guess is that the stop part of the init.d script is only killing one process?
Best Answer
I found the culprit - it was Centreon's config file builder.
ndo2db has a lock_file setting which is missing from the Centreon config UI. When Centreon generates the config files it also generates ndo2db.cfg, but without the lock_file configuration value. There's an open issue about this.
Having spelunked the source code: when ndo2db daemonises and there isn't a lock_file setting, it ignores this and carries on, so no lock file containing the PID is written. This of course means that the stop function in the init script can't identify the ndo2db process ID, so it can't kill the process.
Update:
To resolve this issue, I manually added a new column to the cfg_ndo2db table in the centreon database, then populated it with the path of my ndo2db lock file.
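As a rough sketch of those two steps, the snippet below prints one possible pair of SQL statements. Several things here are assumptions rather than facts from the post: that the column is named lock_file (to match the setting name Centreon should emit), that VARCHAR(255) is a suitable type, and that the lock file lives at the hypothetical path shown. The helper name ndo2db_lock_sql is made up for illustration.

```shell
#!/bin/sh
# Sketch only: the column name, column type and lock file path are assumptions.
LOCK_FILE_PATH="/usr/local/nagios/var/ndo2db.lock"   # hypothetical path

# Print the two SQL statements so they can be reviewed before being applied.
# Note: the UPDATE has no WHERE clause, so it sets the value on every row.
ndo2db_lock_sql() {
    cat <<EOF
ALTER TABLE cfg_ndo2db ADD COLUMN lock_file VARCHAR(255) DEFAULT NULL;
UPDATE cfg_ndo2db SET lock_file = '$1';
EOF
}

ndo2db_lock_sql "$LOCK_FILE_PATH"

# To apply the statements (after taking a backup), pipe them into the
# mysql client against the centreon database, for example:
#   ndo2db_lock_sql "$LOCK_FILE_PATH" | mysql -u root -p centreon
```

Printing the statements first rather than running them directly makes it easy to check them against your own schema before touching the database.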
This will force Centreon to write the lock_file setting each time the config is generated. It also appears to survive upgrades, though I'd always check the database upgrade scripts to ensure this doesn't sneak in as an undocumented fix.
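To illustrate why the missing setting breaks the stop action, here is a minimal sketch of a PID-file-based stop routine. It is not the actual ndo2db init script; the function names get_lock_file and stop_ndo2db are hypothetical, and it assumes the lock file contains only the daemon's PID.

```shell
#!/bin/sh
# Sketch only: this is not the real ndo2db init script.

# Print the lock_file value from an ndo2db.cfg-style file (empty if unset).
get_lock_file() {
    sed -n 's/^[[:space:]]*lock_file[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Stop the daemon whose PID is recorded in the lock file named by the config.
stop_ndo2db() {
    cfg="$1"
    lock_file=$(get_lock_file "$cfg")
    if [ -z "$lock_file" ] || [ ! -f "$lock_file" ]; then
        # Without lock_file in the config there is no PID to look up,
        # which is the situation described above.
        echo "no lock file for $cfg; cannot determine PID"
        return 1
    fi
    pid=$(cat "$lock_file")
    kill "$pid" && rm -f "$lock_file"
}

# Example call (path taken from the ps output in the question):
#   stop_ndo2db /usr/local/nagios/etc/ndo2db.cfg
```

With a generated ndo2db.cfg that lacks lock_file, a routine like this takes the first branch and gives up, which matches the "was not running" message even though the daemon is still alive.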