Ubuntu – Running multiple instances with upstart

I have a server that I've written that needs to run on multiple ports.

So I've written two Upstart jobs, as the Upstart Cookbook suggests: http://upstart.ubuntu.com/cookbook/#instance

Task to spawn instances:

# gatling-broadcast.conf
description "Start multiple gatling-broadcast servers"
console output

start on runlevel [2345]

task

script
    for i in `seq 21001 21004`; do
        start gatling-broadcast-worker PORT=$i
    done
end script

Instance script:

# gatling-broadcast-worker.conf
description "Gatling-broadcast server."
instance $PORT  
env ROOT="/home/ctargett/Projects/msgqueue"
exec $ROOT/env/bin/pypy $ROOT/main.py --mode=broadcast --port=$PORT --log_file_prefix=/tmp/gatling-broadcast-$PORT.log

I can run multiple instances of the gatling-broadcast-worker job fine:

$ initctl --session start gatling-broadcast-worker PORT=21001
gatling-broadcast-worker (21001) start/running, process 20956
$ initctl --session start gatling-broadcast-worker PORT=21002
gatling-broadcast-worker (21002) start/running, process 20963
$ initctl --session list
gatling-broadcast stop/waiting
gatling-monitor stop/waiting
gatling-broadcast-worker (21002) start/running, process 20963
gatling-broadcast-worker (21001) start/running, process 20956

Yet when I try to start the gatling-broadcast job, I get an error:

$ initctl --session start gatling-broadcast
initctl: Job failed to start

The output of init shows nothing helpful beyond "init: gatling-broadcast main process (21007) terminated with status 1":

Loading configuration from /home/ctargett/Projects/msgqueue/upstart
job_class_unregister: Unregistered job /com/ubuntu/Upstart/jobs/gatling_2dbroadcast
conf_file_destroy: Destroyed unused job gatling-broadcast
conf_reload_path: Loading gatling-broadcast from /home/ctargett/Projects/msgqueue/upstart/gatling-broadcast.conf
parse_job: Creating new JobClass gatling-broadcast
job_class_register: Registered job /com/ubuntu/Upstart/jobs/gatling_2dbroadcast
job_class_unregister: Unregistered job /com/ubuntu/Upstart/jobs/gatling_2dbroadcast_2dworker
conf_file_destroy: Destroyed unused job gatling-broadcast-worker
conf_reload_path: Loading gatling-broadcast-worker from /home/ctargett/Projects/msgqueue/upstart/gatling-broadcast-worker.conf
parse_job: Creating new JobClass gatling-broadcast-worker
job_class_register: Registered job /com/ubuntu/Upstart/jobs/gatling_2dbroadcast_2dworker
job_class_unregister: Unregistered job /com/ubuntu/Upstart/jobs/gatling_2dmonitor
conf_file_destroy: Destroyed unused job gatling-monitor
conf_reload_path: Loading gatling-monitor from /home/ctargett/Projects/msgqueue/upstart/gatling-monitor.conf
parse_job: Creating new JobClass gatling-monitor
job_class_register: Registered job /com/ubuntu/Upstart/jobs/gatling_2dmonitor

job_register: Registered instance /com/ubuntu/Upstart/jobs/gatling_2dbroadcast/_
gatling-broadcast goal changed from stop to start
gatling-broadcast state changed from waiting to starting
event_new: Pending starting event
Handling starting event
event_finished: Finished starting event
gatling-broadcast state changed from starting to pre-start
gatling-broadcast state changed from pre-start to spawned
gatling-broadcast main process (21007)
gatling-broadcast state changed from spawned to post-start
gatling-broadcast state changed from post-start to running
event_new: Pending started event
Handling started event
event_finished: Finished started event
init: gatling-broadcast main process (21007) terminated with status 1
gatling-broadcast goal changed from start to stop
gatling-broadcast state changed from running to stopping
event_new: Pending stopping event
Handling stopping event
event_finished: Finished stopping event
gatling-broadcast state changed from stopping to killed
gatling-broadcast state changed from killed to post-stop
gatling-broadcast state changed from post-stop to waiting
event_new: Pending stopped event
job_change_state: Destroyed inactive instance gatling-broadcast
Handling stopped event
event_finished: Finished stopped event

Best Answer

I would guess the problem is that gatling-broadcast is failing to start one or more of the instances, probably because they are already running. Remember that Upstart runs your job with 'sh -e', so if any simple command fails, the script exits immediately. See:

http://upstart.ubuntu.com/cookbook/#debugging-a-script-which-appears-to-be-behaving-oddly
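
The effect of 'sh -e' can be reproduced outside Upstart. The following is a minimal sketch, assuming a POSIX sh is available as "sh". It uses the shell builtin "false" as a stand-in for a start command that fails, and the variable names are chosen only for illustration:

```shell
# Unguarded: under -e the first failing simple command aborts the script,
# so the echo is never reached. The outer "|| true" only keeps this demo
# itself from failing on the child shell's non-zero exit status.
unguarded=$(sh -e -c 'for i in 1 2 3; do false; echo "reached $i"; done' || true)

# Guarded: "|| true" turns the failure into a success, so -e does not
# trigger and the loop runs to completion.
guarded=$(sh -e -c 'for i in 1 2 3; do false || true; echo "reached $i"; done')

printf 'unguarded: [%s]\n' "$unguarded"
printf 'guarded:   [%s]\n' "$guarded"
```

The unguarded loop produces no output because the shell exits at the first failure, while the guarded loop reaches all three iterations.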

If you are running Ubuntu Precise, have a look at /var/log/upstart/gatling-broadcast.log. You might want to add "set -x" to the top of the script stanza to see exactly where it is failing.

The easy fix is to ignore the failure of an individual start command:

script
    for i in `seq 21001 21004`; do
        start gatling-broadcast-worker PORT=$i || true
    done
end script

A better solution, though, would be to check the status of each gatling-broadcast-worker instance and attempt to start it only if it is not running. If that start fails, take some appropriate action.
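
A rough sketch of that status check might look like the following. It is not a tested Upstart job: the function names ensure_worker and ensure_workers are hypothetical, the match on "start/running" is taken from the initctl output shown in the question, and "appropriate action" is assumed here to mean logging to stderr and reporting failure once all ports have been tried.

```shell
# Sketch only: intended to sit inside the script ... end script stanza of
# gatling-broadcast.conf. Function names are hypothetical.

# Start one worker instance unless Upstart already reports it as running.
ensure_worker() {
    worker_port="$1"
    # status exits non-zero for an unknown instance; because the pipeline
    # is an "if" condition, that does not trip "sh -e".
    if status gatling-broadcast-worker PORT="$worker_port" 2>/dev/null \
            | grep -q 'start/running'; then
        return 0
    fi
    start gatling-broadcast-worker PORT="$worker_port"
}

# Try every port from $1 to $2, log each failure, and return non-zero
# only after all ports have been attempted.
ensure_workers() {
    failed=0
    for p in $(seq "$1" "$2"); do
        if ! ensure_worker "$p"; then
            echo "could not start gatling-broadcast-worker on port $p" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

With these definitions at the top of the script stanza, the loop from the question would reduce to a single call such as ensure_workers 21001 21004. Unlike "|| true", this still lets the task fail when a worker genuinely cannot be started, while instances that are already running no longer cause an error.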