Why is there a concept of slots in SGE?


According to the SGE 5.3 manual:

Slots – The number of jobs which may be executed concurrently in that queue

I am new to these concepts and want to understand them one at a time.

Suppose a host has 10G of RAM and 10 slots, hence 1G per slot. Does that mean only jobs needing less than 1G can run? And if a job needs only 0.5G, won't it waste the remaining 0.5G in that slot? If so, what is the use of a grid if resources are not optimized?

And if a 2G job is spread across multiple slots, is that called a parallel job or a normal job?

And is there any difference between the queue and slot concepts in SGE v5.3 and in v6.0 and above?

Best Answer

A CPU core (barring hyperthreading or similar) can only run one process at a time. On a desktop or regular web server, it switches between processes very quickly to create the illusion that multiple processes are running simultaneously. This lowers overall CPU throughput, however, because the switching has costs (swapping, cache invalidation, context switches). That doesn't matter when the core spends most of its time waiting for I/O (such as user input or network connections), but in HPC/HTC (the main use case for Grid Engine) each program is written to make efficient use of resources, so you get work done more quickly if a batch system arranges for programs to run one after another rather than switching between them.

In such circumstances, Grid Engine is normally configured to use slots to represent cores in order to prevent overcommit.

Grid Engine can be configured to track memory separately from cores/slots, so a slot does not have to correspond to a fixed share of the host's RAM.
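To make the distinction concrete, here is a toy model in Python of a host where slots and memory are tracked as two independent resources. It is only a sketch of the idea, not Grid Engine's actual scheduler: the `Job` and `Host` classes and the `try_start` method are hypothetical names invented for illustration, and memory is counted per job here for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    slots: int      # cores the job asks for
    mem_gb: float   # memory the job asks for, independent of slots


@dataclass
class Host:
    slots: int      # assumed: one slot per CPU core
    mem_gb: float   # total memory, tracked separately from slots
    running: list = field(default_factory=list)

    def free_slots(self):
        return self.slots - sum(j.slots for j in self.running)

    def free_mem(self):
        return self.mem_gb - sum(j.mem_gb for j in self.running)

    def try_start(self, job):
        # A job starts only if BOTH resources are available.
        if job.slots <= self.free_slots() and job.mem_gb <= self.free_mem():
            self.running.append(job)
            return True
        return False


# The host from the question: 10 slots, 10G of RAM.
host = Host(slots=10, mem_gb=10.0)
started = [
    host.try_start(Job("small", 1, 0.5)),  # uses 1 slot, 0.5G
    host.try_start(Job("big", 1, 2.0)),    # 1 slot but 2G: still fits
    host.try_start(Job("huge", 1, 9.0)),   # slot is free, memory is not
]
print(started)                               # [True, True, False]
print(host.free_slots(), host.free_mem())    # 8 7.5
```

In this model the 0.5G job does not "waste" half a gigabyte, and the 2G job does not need two slots: a job takes one slot per core it uses and whatever memory it requests, and it is refused only when either resource runs out.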
