Linux – get notification when systemd-monitored service enters failed state

bash linux service systemd

I need to send network messages when one of my systemd services crashes or hangs (i.e., enters the failed state; I detect hangs with WatchdogSec=). I noticed that newer systemd versions have FailureAction=, but it doesn't allow arbitrary commands, only actions such as reboot or shutdown.

Specifically, I need a way to have one network message sent when systemd detects the program has crashed, and another when it detects it has hung.

I'm hoping for a better answer than "parse the logs", and I need something that has a near-instant response time, so I don't think a polling approach is good; it should be something triggered by the event occurring.

Best Answer

systemd units support OnFailure=, which activates one or more units when the unit enters the failed state. In the [Unit] section of the monitored service, you can put something like

 OnFailure=notify-failed@%n

Then create the notify-failed@.service template unit, where you can use whichever specifiers you need (you will probably want at least %i, the instance name) to launch the script or command that sends the notification.
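As a rough sketch of what such a script might look like: the question asks for different messages for a crash and a hang, and one way to tell them apart is the Result property that systemd records for the failed service (watchdog for a watchdog timeout; signal, core-dump or exit-code for the usual crash cases). The script name, the message format, and the idea of calling it as ExecStart=/usr/local/bin/notify-failed.sh %i from the template unit are all assumptions made for illustration. It only prints the message; the actual network transport is left to you.

```shell
#!/bin/sh
# Hypothetical helper for notify-failed@.service (name and path are assumptions).
# Intended call: notify-failed.sh <unit-name>

# Map systemd's Result property to the two cases the question distinguishes.
classify_result() {
    case "$1" in
        watchdog) echo "hung" ;;                        # WatchdogSec= timeout
        signal|core-dump|exit-code) echo "crashed" ;;   # abnormal exit
        *) echo "failed ($1)" ;;                        # anything else, e.g. timeout
    esac
}

# Build the one-line message text (the format is an arbitrary choice).
build_message() {
    printf '%s: %s\n' "$1" "$(classify_result "$2")"
}

# Only query systemd when a unit name was passed on the command line.
if [ "$#" -ge 1 ]; then
    unit=$1
    # --value prints just the property value (needs a reasonably recent systemctl).
    result=$(systemctl show -p Result --value "$unit")
    # Replace this with whatever sends your network message.
    build_message "$unit" "$result"
fi
```

Note that if the monitored service also uses Restart=, systemd may restart it instead of leaving it in the failed state, so check that OnFailure= actually fires in your setup.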

You can see a practical example at http://northernlightlabs.se/systemd.status.mail.on.unit.failure