Set up SGE to Fill Each Node Completely Rather than Distribute Jobs

cluster gridengine

Originally posted on Stack Overflow by mistake… See PS at bottom for response from that post.

I've searched for this for a while, but cannot find the answer. The problem I have is this: assume I have an SGE setup with two 12-CPU machines. I have two 1-CPU jobs to submit to the grid, but other users will often want to submit 12-CPU jobs. These are shared memory jobs that cannot be split across multiple machines. What happens is that sometimes I'll submit my two jobs and they'll each go to a separate machine, leaving each with 11/12 CPUs free. This then prevents others from running 12-CPU jobs while I'm working.

Is there a way around this? I know that you can use the fillup rules to control a single qsub (so fillup can make a 12-CPU qsub either stay on one machine, split between several, etc.), but is there a comparable setting to force separate qsubs to go to the same machine? I also know I can explicitly request a particular machine (I think it is -h machinename, or something similar), but I'd prefer to have a more robust setup than this.

Any help is appreciated. Thanks!

PS: On the Stack Overflow post, one response came in before the thread was closed, suggesting the parallel environment setting allocation_rule=$fill_up. Unless I've done something wrong in trying it, I don't think this solves the problem. From what I've seen in testing, setting $fill_up means that the CPUs requested WITHIN a single qsub are put on the same grid machine if possible, but CPUs from DIFFERENT qsubs will still go to the low-load machine (or whatever the grid chooses), and might go to an empty machine. Testing for this involved qsubbing a few single-CPU jobs, waiting ~5 min, then submitting a few more. Sometimes the first group would end up on the same machine (I'm guessing because machine load isn't real time, so they were all sent to the same low-load machine?), but the second group would not consistently go to the same machine as the first group.

Best Answer

The scheduler's default load_formula setting is np_load_avg, which assigns new jobs to the nodes with the lowest load average. To have it fill nodes instead, you can set load_formula to slots. To see the current scheduler settings:

qconf -ssconf

To modify the settings:

qconf -msconf
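
Since qconf -msconf opens an editor, a scripted version of the same change may be handy. The following is only a sketch under a few assumptions: that your Grid Engine build provides qconf -Msconf (load the scheduler configuration from a file), that the file has the same layout as the output of qconf -ssconf, and that you run it as an SGE manager. The helper name set_load_formula is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the load_formula line in a saved
# scheduler configuration file (same layout as `qconf -ssconf` output).
#   $1 = path to the saved configuration file
#   $2 = new load formula, e.g. slots
set_load_formula() {
    sed "s/^load_formula[[:space:]].*/load_formula $2/" "$1" > "$1.new" &&
        mv "$1.new" "$1"
}

# Only touch the live scheduler when qconf is actually available.
if command -v qconf >/dev/null 2>&1; then
    conf=$(mktemp)
    qconf -ssconf > "$conf"          # dump the current scheduler settings
    set_load_formula "$conf" slots   # switch from np_load_avg to slots
    qconf -Msconf "$conf"            # load the edited settings back in
    rm -f "$conf"
fi
```

Afterwards, qconf -ssconf should report the new load_formula value; it is worth submitting a few single-CPU jobs to check that they now land on the same node.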