Linux – Local docker execution and CTRL-C signal propagation

Tags: docker, linux

I have a cluster submission system based on docker and I'm trying to get it to also support local execution. When executing locally, the command that starts the job is basically

docker run /results/src/launcher/local.sh

For cluster execution, a different script is run instead. The difficulty I'm facing is running the code as the local user while still supporting CTRL-C correctly. Since docker run starts the entrypoint as uid 0, I need to run the user's entrypoint with su -c. Basically, the script needs to run two things:

  1. A prerun script (called as root)
  2. A Python program (called as calling user)

The meat of the script is currently the following:

# Run prerun script
$PRERUN &
PRERUN_PID=$!
wait $PRERUN_PID
status=$?
PRERUN_FINISHED=true

if [ "$status" -eq "0" ]; then
    echo "Prerun finished successfully."
else
    echo "Prerun failed with code: $status"
    exit $status
fi

# Run main program dropping root privileges.
su -c '/opt/conda/bin/python /results/src/launcher/entrypoint.py \
      > >(tee -a /results/stdout.txt) 2> >(tee -a /results/stderr.txt >&2)' \
      $USER &
PYTHON_PID=$!
wait $PYTHON_PID
status=$?
PYTHON_FINISHED=true

if [ "$status" -eq "0" ]; then
    echo "Entrypoint finished successfully."
else
    echo "Entrypoint failed with code: $status"
    exit $status
fi

Signal propagation is handled in the same script by:

_int() {
    echo "Caught SIGINT signal!"
    if [ "$PRERUN_PID" -ne "0" ] && [ "$PRERUN_FINISHED" = "false" ]; then
        echo "Sending SIGINT to prerun script!"
        kill -INT $PRERUN_PID
        PRERUN_PID=0
    fi
    if [ "$PYTHON_PID" -ne "0" ] && [ "$PYTHON_FINISHED" = "false" ]; then
        echo "Sending SIGINT to Python entrypoint!"
        kill -INT $PYTHON_PID
        PYTHON_PID=0
    fi
}

PRERUN_PID=0
PYTHON_PID=0
PRERUN_FINISHED=false
PYTHON_FINISHED=false
trap _int SIGINT

I have a signal handler in /results/src/launcher/entrypoint.py, which is the code run by su -c. However, it never seems to receive the SIGINT. I assume the problem lies in su -c: as expected, PYTHON_PID in the bash script isn't assigned the PID of the Python interpreter but that of the su program. If I do an os.system("ps xa") in my Python entrypoint, I see the following:

  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:00 /bin/bash /results/src/launcher/local.sh user 1000 1000 /results/src/example/compile.sh
   61 ?        S      0:00 su -c /opt/conda/bin/python /results/src/launcher/entrypoint.py \       > >(tee -a /results/stdout.txt) 2> >(tee -a /results/stderr.txt >&2) user
   62 ?        Ss     0:00 bash -c /opt/conda/bin/python /results/src/launcher/entrypoint.py \       > >(tee -a /results/stdout.txt) 2> >(tee -a /results/stderr.txt >&2)
   66 ?        S      0:01 /opt/conda/bin/python /results/src/launcher/entrypoint.py
   67 ?        S      0:00 bash -c /opt/conda/bin/python /results/src/launcher/entrypoint.py \       > >(tee -a /results/stdout.txt) 2> >(tee -a /results/stderr.txt >&2)
   68 ?        S      0:00 bash -c /opt/conda/bin/python /results/src/launcher/entrypoint.py \       > >(tee -a /results/stdout.txt) 2> >(tee -a /results/stderr.txt >&2)
   69 ?        S      0:00 tee -a /results/stdout.txt
   70 ?        S      0:00 tee -a /results/stderr.txt
   82 ?        R      0:00 /opt/conda/bin/python /results/src/launcher/entrypoint.py
   83 ?        S      0:00 /bin/dash -c ps xa
   84 ?        R      0:00 ps xa

PYTHON_PID is assigned the PID 61. However, I would like to be able to gracefully shut down the Python interpreter, so I should be able to catch some signal there. Does anyone know how to forward a SIGINT to the Python interpreter in a situation like this? Or is there a smarter way to do what I'm trying to accomplish? I have full control over the code that puts together the docker run command when code is scheduled for local execution.
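The delivery problem can be reproduced without docker or root, with sh -c standing in for su -c: the signal lands on the wrapper process, and the wrapped program keeps running. (SIGTERM is used here only because background jobs started from a non-interactive shell ignore SIGINT; the delivery problem is the same.)

```shell
# sh -c plays the role of su -c: it forks a grandchild and waits for it.
pidfile=$(mktemp)
sh -c "sleep 60 & echo \$! > $pidfile; wait" &   # wrapper + grandchild
WRAPPER=$!
sleep 1
GRANDCHILD=$(cat "$pidfile")

kill -TERM "$WRAPPER"          # signal the wrapper, like kill -INT $PYTHON_PID
sleep 1

# The wrapper is gone, but the grandchild never saw the signal.
if kill -0 "$GRANDCHILD" 2>/dev/null; then survived=yes; else survived=no; fi
kill "$GRANDCHILD" 2>/dev/null || true           # clean up the orphan
rm -f "$pidfile"
echo "grandchild survived: $survived"
```

This is exactly what happens when the script sends SIGINT to su's PID: su terminates or ignores it, and the Python interpreter two levels down is never signalled.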

Best Answer

There are a few things going on here. First, you are running a shell script as PID 1 inside the container. That process is what receives the Ctrl+C (or the signal sent by docker stop), and it is up to bash to trap and handle it. By default, when running as PID 1, bash ignores these signals (I believe to handle single-user mode on a Linux server). You would need to explicitly trap and handle them with something like:

trap 'pkill -P $$; exit 1;' TERM INT

at the top of the script. That catches SIGTERM and SIGINT (a Ctrl+C generates SIGINT), kills child processes, and exits immediately.
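A minimal sketch of the trap's effect, runnable outside docker: a script that traps the signal, kills its child, and exits non-zero. (SIGTERM triggers it here because background jobs in a non-interactive shell ignore SIGINT, and plain kill replaces pkill to avoid a procps dependency; the mechanism is the same.)

```shell
# Write a small script that installs the trap and waits on a child.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/bash
trap 'kill "$CHILD" 2>/dev/null; exit 1' TERM INT
sleep 60 &            # stand-in for a long-running child process
CHILD=$!
wait "$CHILD"
EOF

bash "$script" &
PARENT=$!
sleep 1               # give the trap time to be installed
kill -TERM "$PARENT"  # what docker stop sends to PID 1
status=0
wait "$PARENT" || status=$?
rm -f "$script"
echo "script exited with status $status"
```

Without the trap, the same script would sit in wait for the full 60 seconds; with it, the child is killed and the script exits immediately with status 1.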

Next, there is the su command, which forks a child process, and that extra process in the chain breaks signal handling. I prefer gosu, which execs the target command instead of forking, removing itself from the process list. You can install gosu with the following in a Dockerfile:

ARG GOSU_VER=1.10
ARG GOSU_ARCH=amd64
RUN curl -sSL "https://github.com/tianon/gosu/releases/download/${GOSU_VER}/gosu-${GOSU_ARCH}" >/usr/bin/gosu \
 && chmod 755 /usr/bin/gosu \
 && gosu nobody true
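The fork-vs-exec difference can be demonstrated without gosu (sh -c stands in for the target program): after exec, the target keeps the shell's PID, so a signal aimed at that PID reaches the real program rather than a wrapper.

```shell
# A script that prints its own PID, then execs another program
# that prints its PID; exec means no new process is created.
script=$(mktemp)
cat > "$script" <<'EOF'
echo "shell PID  $$"
exec sh -c 'echo "target PID $$"'
EOF

out=$(bash "$script")
rm -f "$script"
echo "$out"           # both lines show the same PID
```

This is why `exec gosu ...` as the last line of an entrypoint script leaves the real program as PID 1, directly in the path of any signal docker delivers.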

Lastly, there's a lot of logic in the entrypoint to fork and then wait for a background process to finish. This can be simplified by running the processes in the foreground. The last command can be started with exec to avoid leaving the shell running. You can catch errors with set -e, and add the -x flag to trace which commands are being run. The end result looks like:

#!/bin/bash

set -ex

# in case a signal is received during PRERUN
trap 'exit 1;' TERM INT

# Run prerun script
$PRERUN

# Run main program dropping root privileges.
exec gosu "$USER" /opt/conda/bin/python /results/src/launcher/entrypoint.py \
      > >(tee -a /results/stdout.txt) 2> >(tee -a /results/stderr.txt >&2)

If you can get rid of the /results logs, you should be able to switch from /bin/bash to /bin/sh at the top of the script (process substitution is a bash-only feature) and just rely on docker logs to see the output from the container.