Linux – Parent bash script not receiving ‘trap’ despite process still running

bash, daemon, linux, shell, unix

What I'm actually trying to achieve:

I'm trying to get a custom daemon working on a system that uses SysVinit. I have the bootstrapper /etc/init.d/xyz script already, which calls my daemon, but it doesn't automatically place it in the background. This is similar to how services like nginx behave: the binary backgrounds itself – i.e. it's not the responsibility of the /etc/init.d/nginx script to daemonise the process, so if you ran /opt/nginx/sbin/nginx directly you would also experience daemonised/background execution.

The Problem

My problem is that using my current method, the daemon doesn't terminate with the parent process (which is what gets terminated when you call service xyz stop).

I'm using a parent launcher.sh script that runs a daemon.sh & script. However, when I kill launcher.sh the daemon.sh continues to run, despite my best efforts with trap (it simply never gets called):

-> launcher.sh

#!/bin/bash

function shutdown {
    # Get our process group id
    PGID=$(ps -o pgid= $$ | grep -o '[0-9]*')

    echo THIS NEVER GETS CALLED!

    # Kill process group in a new process group
    setsid kill -- -$$
    exit 0
}

trap "shutdown" SIGTERM

# Run daemon in background

./daemon.sh &

-> daemon.sh

#!/bin/bash

while true
do
    sleep 1
done

To run & kill:

./launcher.sh

<get PID for launcher>

kill -TERM 123 # PID of launcher.sh... which _is_ still running and has its own PID.

Result: daemon.sh still running and the shutdown function never gets called – I've confirmed this before by placing an echo here in the function body.

Any ideas?

EDIT: The launcher.sh script is being run using daemon launcher.sh, where daemon is a function provided by Amazon Linux's init.d/functions file (see here: http://gist.github.com/ljwagerfield/ab4aed16878dd9a8241b14bc1501392f).

Best Answer

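A trap handler only runs while the script that installed it is still running. As written, launcher.sh starts daemon.sh in the background, reaches the end of the file and exits, so no launcher process is left to receive SIGTERM.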
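To illustrate that point, here is a minimal sketch of a launcher that stays alive so its trap can fire. The function name run_launcher is hypothetical, and sleep 30 stands in for daemon.sh. Note that this keeps the launcher in the foreground, so it only demonstrates the trap behaviour and is not a self-backgrounding daemon.

```shell
#!/bin/bash
# Sketch only: run_launcher is a hypothetical name, and `sleep 30`
# stands in for ./daemon.sh.

run_launcher() {
    sleep 30 &                  # stand-in for: ./daemon.sh &
    local child=$!

    # On SIGTERM: stop the child, report, and leave the current shell.
    # $child is expanded now, when the trap is installed.
    trap "kill $child 2>/dev/null; echo 'trap fired'; exit 0" TERM

    # Without this wait the shell would run off the end of the script
    # and exit, leaving nothing behind to receive the signal.
    wait "$child"
}
```

Calling run_launcher as the last line of a script, and then sending that script SIGTERM, should print "trap fired" and stop the child, because the shell is still sitting in wait when the signal arrives.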

The usual approach is for the launcher to write the daemon's PID to a file when it forks the daemon off. The init script then either reads that file to find the process to kill, or calls your launcher script to kill the process.

For the first approach:

launcher.sh:

/path/to/daemon.sh &
echo "$!" > /var/run/xyz.pid

A simple and somewhat naive version of /etc/init.d/xyz:

# ... pull in functions or sysconfig files ...
start() {
    # ... do whatever is needed to set things up to start ...
    /path/to/launcher.sh
}
stop() {
    # ... do whatever is needed to set things up to stop ...
    kill "$(cat /var/run/xyz.pid)"
}
# ... other functions ...
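
As a rough sketch of how the PID-file approach could be made slightly more defensive, the functions below refuse to start a second copy and clean up the file on stop. The names start_daemon and stop_daemon, and the default PIDFILE path, are assumptions made for illustration; a real init script would use whatever helpers the distribution provides.

```shell
#!/bin/bash
# Sketch only: PIDFILE location and function names are assumptions.
PIDFILE="${PIDFILE:-/tmp/xyz-example.pid}"

start_daemon() {
    # Refuse to start twice if the recorded process is still alive.
    if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
        echo "already running"
        return 0
    fi
    # Run the given command in the background and record its PID.
    "$@" >/dev/null 2>&1 &
    echo "$!" > "$PIDFILE"
}

stop_daemon() {
    # Nothing to do if no PID was ever recorded.
    [ -f "$PIDFILE" ] || { echo "not running"; return 0; }
    local pid
    pid=$(cat "$PIDFILE")
    kill "$pid" 2>/dev/null     # sends SIGTERM by default
    rm -f "$PIDFILE"
}
```

For example, start_daemon /path/to/daemon.sh would go in start(), and stop_daemon in stop(). The kill -0 check sends no signal; it only tests whether the recorded PID still exists, which guards against a stale file but not against the PID having been reused by another process.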

A non-naive startup script will depend on which Linux distribution and version you're running; I would suggest looking at other examples in /etc/init.d to see how they do this.