Linux – Run multiple jobs on the same SLURM worker in parallel

linux

We have some fairly fat nodes in our SLURM cluster (e.g. 14 cores). I'm trying to configure it such that multiple batch jobs can be run in parallel, each requesting, for example, 3 cores. However, I can't get that to work.

Example batch job:

#!/bin/bash
#
#SBATCH --job-name=job1
#SBATCH --output=job1.txt
#
#SBATCH -c 3            # 3 CPUs (cores) for the single task in this job
#SBATCH -N 1            # keep the job on one node
srun sleep 300
srun echo $HOSTNAME
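
To test, I submit two copies of this script, with the job name and output file changed for the second one (the script file names here are just what I happen to use):

sbatch job1.sh
sbatch job2.sh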

Excerpt from the slurm.conf file:

TaskPlugin=task/cgroup
SelectType=select/cons_res
SelectTypeParameters=CR_CORE
NodeName=some-node NodeAddr=192.168.60.106 CPUs=12 State=UNKNOWN

But, if I run the two jobs, I get the following error:

sbatch: error: CPU count per node can not be satisfied

I found quite a few examples saying that the sbatch -n option is what controls the number of CPUs or cores per batch job. However, that does not make sense to me, since the documentation states:

Controls the number of tasks to be created for the job
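
So presumably the -n variant of the header would look something like this (just a sketch; the value 3 is my example again):

#SBATCH -n 3            # 3 tasks: each srun step would then be launched 3 times
#SBATCH -N 1

rather than -c 3, which asks for 3 CPUs for a single task.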

If I try that, it just runs the jobs sequentially:

         JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            16  mainpart     job2  some-user PD       0:00      1 (Resources)
            15  mainpart     job1  some-user  R       4:04      1 some-node

Best Answer

I had the same problem for days: SLURM would run only one job per node, no matter what I put in the batch files. The following combination of settings finally let me get multiple batch jobs running on a single node in parallel.

Before starting, make sure no jobs are running and take your nodes out of service. See this answer for more on service vs. systemctl for doing so on most Linux systems.

sudo service slurmd stop
sudo service slurmctld stop
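
On systems where the daemons are managed by systemd directly, the equivalent should be (assuming the usual unit names slurmd and slurmctld):

sudo systemctl stop slurmd
sudo systemctl stop slurmctld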

In /etc/slurm-llnl/slurm.conf (the location may differ):

...
SelectType=select/cons_res
SelectTypeParameters=CR_Core
...
NodeName=a NodeAddr=192.168.1.2 CPUs=16 Sockets=2 CoresPerSocket=4 ThreadsPerCore=2 RealMemory=12005 State=UNKNOWN

This line is obviously specific to one particular node, and yours will differ. But if a node's definition does not match its actual hardware, SLURM can reject jobs with errors about resources that cannot be satisfied. To get reliable values for a node, run the following on the node itself:

sudo slurmd -C
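
Its output is already in slurm.conf node-definition syntax, roughly like the following (the values here are purely illustrative):

NodeName=a CPUs=16 Boards=1 SocketsPerBoard=2 CoresPerSocket=4 ThreadsPerCore=2 RealMemory=12005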

Then use its output to define each node in the controller's slurm.conf file. Once everything is set up, start SLURM back up and submit a few test batches to check that they spread across (and share) the nodes properly; a quick test is sketched after the start commands below.

sudo service slurmd start
sudo service slurmctld start
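
If a node comes back in a drained or down state, resume it first, then submit a couple of test batches and watch the queue (the node name and script name are only examples):

sudo scontrol update NodeName=some-node State=RESUME
sbatch job1.sh
sbatch job1.sh
squeue

With the node defined correctly, both jobs should be in state R on the same node at the same time.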