Question about priority and job not starting while resources are available

Hi,

I have questions about priority and resource availability. Concretely, my jobs don’t start and I don’t know why.

My personal story:
Yesterday I used the cluster (GPU nodes of the shared-gpu partition, because my private partition wasn’t “idle”) for approximately 4-6 h in total (if I remember correctly).
Today I wanted to run 18 jobs of 30 min each on Ampere GPUs with the double-precision constraint. After a few hours none of these jobs were running, even though gpu020 of “my” private partition was idle, and the same was true for gpu[026-028] of shared-gpu.
When I type squeue --me, the reasons shown are Priority and Resources.

So I don’t know whether this is a bug, or what the “Resources” reason actually means.

Also I have questions about the priority:

  • Does it take into account the time requested, or only the time actually used?
  • Does it take into account the FLOPS of the GPUs, or does a given amount of time on a P100 and on an A100 have the same impact on my priority?
  • What happens if a job is cancelled before execution? And during execution?

Sorry for all these questions, but this “priority” thing is a bit mysterious to me, so I don’t understand why my jobs sometimes start immediately and sometimes stay pending for a long time.

Thank you in advance for your time,

Best,
Ludovic

Hi,

This is not an official answer or anything, but I remember having similar questions before; if you want, you can check out this post: questions about pending jobs

There are some commands you can use to check more precisely how “free” a partition is and what might be limiting your job. Requesting too much memory, for instance, could be one such limit.
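Off the top of my head, the commands I have in mind are something like the following (shared-gpu and the job ID are just placeholders, adapt them to your case):

sinfo -p shared-gpu -o "%n %t %C %m %G"              # node state, allocated/idle CPUs, memory and GPUs per node
squeue --me --start                                  # pending reason and estimated start time of your jobs
scontrol show job 12345678 | grep -E "Reason|TRES"   # exact pending reason and requested resources of one job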

Hi,
Thank you!
In my case it is a bit different, because I only need GPUs.

It seems I can’t start a job on the private-kruse-gpu partition.
I submitted a 30-minute job 14 hours ago and it still hasn’t started…
The two nodes were free (most of the time).
I used the pestat -p command from your previous post. It reports the state as idle, the number of CPUs in use as 0, and a CPUload shown in red: 0.75* for gpu020 and 0.88* for gpu031.

“Does it take into account the time requested or only the time of usage?”
Only the time of usage.

“Does it take into account the FLOPS of the GPUs?”
Only the CPU, I guess, but not 100% sure.

“What if a job is cancelled before execution?”
No change in the usage/priority.

“And during execution?”
The time used is taken into account.
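If you want to see the numbers behind this, you can look at the priority factors of a pending job and at your own fairshare usage, for example (the job ID below is only an example):

sprio -j 12345678 -l     # per-factor breakdown of the job priority (age, fairshare, partition, ...)
sshare -u $USER -l       # your raw and effective usage and the resulting fairshare factor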


Can you contact us when you have a job pending like this? It should not happen, and that way we can have a look at the logs.

And please share your sbatch script.

Best

Yann

Thank you,
here is my sbatch script, restricted to the private-kruse-gpu partition; job 58378431 is pending:

#!/bin/env bash
#SBATCH --partition=private-kruse-gpu
#SBATCH --time=0-01:00:00
#SBATCH --gpus=ampere:1
#SBATCH --constraint=DOUBLE_PRECISION_GPU
#SBATCH --output=%J.out
#SBATCH --mem=3000
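# (the actual work command is omitted here; judging from the sacct output further down, it was an srun step running julia)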

Hi, thanks.

That is very weird. I’ll open a case with SchedMD (the company behind Slurm), as I don’t understand the reason. I’ll keep you posted.

Best

Yann

Thank you,

I am sorry, you will also receive an email, because I didn’t know whether you were notified here directly.

Best,
Ludovic

It would be good if the negative impact on priority of a P100 could be set 4-8 times smaller than that of an A100, because the P100s are much slower… That is why I don’t use the P100s even when I could. I think the P100 should be almost “free” (~1/10) compared to the A100, otherwise people will not use them (at least I will not).
I have absolutely no idea whether this kind of priority weighting is possible.
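From a quick look at the Slurm documentation, per-partition TRESBillingWeights might be a way to express this, but I am really not sure; the excerpt below is purely hypothetical (made-up values, shared-gpu only as an example):

# hypothetical slurm.conf excerpt (made-up values, existing partition options omitted):
# the goal is to bill A100 time roughly 8x heavier than P100 time
AccountingStorageTRES=gres/gpu,gres/gpu:p100,gres/gpu:a100
PartitionName=shared-gpu TRESBillingWeights="CPU=1.0,GRES/gpu:p100=1.0,GRES/gpu:a100=8.0"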

Hi,

After restarting the Slurm controller, your job finished successfully.

[dumoulil@login2.baobab ~]$ sacct -j 58378431
JobID           JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
58378431         N2C.sh private-k+     krusek          1  COMPLETED      0:0
58378431.ba+      batch                krusek          1  COMPLETED      0:0
58378431.ex+     extern                krusek          1  COMPLETED      0:0
58378431.0        julia                krusek          1  COMPLETED      0:0

It turns out we had modified the Slurm configuration without restarting the services when we added two GPUs to this node recently.
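For reference, if I recall the Slurm documentation correctly, node/GRES definition changes require restarting the daemons and not just a reconfigure; on a systemd-managed installation that is roughly:

systemctl restart slurmctld     # on the controller
systemctl restart slurmd        # on the compute node whose GPUs changed
# a plain "scontrol reconfigure" is not always enough for node/GRES definition changes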

Can you confirm it is running as expected?

Thank you!
It is working fine now!