Getting more details about pending jobs

Yann.Sagon · July 16, 2021, 12:13pm

Hi Quentin,

The reason is indeed a lack of resources, i.e. you need to wait for another job to finish before yours can start. If the reason is Priority, it means other jobs are ahead of you in the queue.
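To make the two cases concrete, here is a minimal Python sketch that reads the reason column from `squeue` output. It assumes the output was produced with `squeue --noheader --format="%i %t %r"` (job ID, compact state, reason). The function names and the sample job IDs are made up for illustration, and the explanations simply restate the above.

```python
import getpass
import subprocess


def read_squeue() -> str:
    """Return 'jobid state reason' lines for the current user's jobs.

    Only works on a machine where Slurm's squeue is installed.
    """
    cmd = ["squeue", "--noheader", "--user", getpass.getuser(),
           "--format=%i %t %r"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def pending_reasons(squeue_output: str) -> dict[str, str]:
    """Map job ID to reason for every job in state PD (pending)."""
    reasons = {}
    for line in squeue_output.strip().splitlines():
        parts = line.split(None, 2)  # the reason may contain spaces
        if len(parts) != 3:
            continue
        job_id, state, reason = parts
        if state == "PD":
            reasons[job_id] = reason.strip()
    return reasons


def explain(reason: str) -> str:
    """Plain-language reading of the two reasons discussed in this thread."""
    if reason == "Resources":
        return "waiting for another job to finish and free resources"
    if reason == "Priority":
        return "other jobs are ahead of this one in the queue"
    return "see the Slurm documentation for this reason"


# Hypothetical sample, in the format produced by read_squeue().
SAMPLE = """\
1001 PD Resources
1002 PD Priority
1003 R None
"""
```

Usage: `pending_reasons(read_squeue())` on the cluster, or `pending_reasons(SAMPLE)` to try it without Slurm.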

So the reason Resources may refer to anything that is accounted for (memory, GPU, CPU, license), I guess. Unless you asked for a lot of memory per CPU or for GPUs, the reason is almost always related to the number of CPUs you requested. If the cluster is full, your job will be pending even if you asked for a single CPU.

If you ask, for example, for a job with 20 CPUs per task, this forces your job onto a compute node with at least 20 CPUs and rules out all the nodes with 12 or 16 CPUs. If your job can run with 12 or 16 CPUs, it is better to ask for 12 CPUs, as your job will start sooner. The fewer resources you ask for, the sooner your job will start.
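A small Python sketch of that reasoning: a node can host the task only if it has at least as many CPUs as you request per task. The node names and the inventory are hypothetical (only the 12/16/20 CPU counts come from the example above), and a real scheduler also looks at memory, partitions and what is currently in use.

```python
def eligible_nodes(node_cpus: dict[str, int], cpus_per_task: int) -> list[str]:
    """Return the nodes that have enough CPUs to host one task."""
    return sorted(name for name, cpus in node_cpus.items()
                  if cpus >= cpus_per_task)


# Hypothetical inventory: node name -> number of CPUs on that node.
NODES = {"node-12c": 12, "node-16c": 16, "node-20c": 20}

# Asking for 20 CPUs per task leaves only the 20-CPU node.
only_big = eligible_nodes(NODES, 20)

# Asking for 12 CPUs per task keeps every node in play.
all_nodes = eligible_nodes(NODES, 12)
```

Here `only_big` contains a single node while `all_nodes` contains all three, so the smaller request has more places where it can start.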

More about this here: