Problems with priority on private-kruse-gpu

Hello,

We are several members of the Kruse group (muelleel, schaerjo), and we have noticed what seems to be a priority issue when launching jobs within the group on private-kruse-gpu.

This is the bash script used:

#!/bin/env bash
#SBATCH --array=1-9%20
#SBATCH --partition=private-kruse-gpu,shared-gpu
#SBATCH --time=0-00:05:00
#SBATCH --output=%J.out
#SBATCH --mem=3000  
#SBATCH --gpus=ampere:1 
#SBATCH --constraint=DOUBLE_PRECISION_GPU

module load Julia

cd /home/users/m/muelleel/scratch/Debug/2023-06-09_adv_norxn/
srun julia --optimize=3 /home/users/m/muelleel/Code/Debug/2023-06-09_adv_norxn/Circle.jl

I launched a job this morning that should only take 5 minutes, and it still has not started, whereas my colleague (dumoulil) has managed to run several eight-hour jobs.

Thank you in advance.

Hello,

I can confirm that my jobs (dumoulil) start immediately, while the other members of the lab have to wait for my hundred or so jobs to finish before theirs can start.

Best,
Ludovic

Hi,

@Ludovic.Dumoulin, are you using the same sbatch script as @ella.muller?

For @ella.muller: you are requesting a double-precision card. On the kruse partition, that means you have access to gpu[020,030,031]. Right now, all the GPUs on those nodes are in use by @Ludovic.Dumoulin.
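To check this kind of situation yourself, something like the sketch below may help. It assumes the standard Slurm client tools (`sinfo`, `squeue`) are available on the login node; the function names `partition_nodes` and `jobs_on_nodes` are made up for illustration and are not part of Slurm.

```shell
#!/usr/bin/env bash
# Sketch only: assumes standard Slurm client tools are installed.
# The function names are illustrative, not part of Slurm.

# List each node of a partition with its features, GPUs (GRES) and state,
# to see which nodes carry a feature such as DOUBLE_PRECISION_GPU.
partition_nodes() {
    local partition="$1"
    sinfo --partition="$partition" --Node --format="%N %f %G %T"
}

# List the jobs currently running on a set of nodes, with their owners.
jobs_on_nodes() {
    local nodes="$1"
    squeue --nodelist="$nodes" --states=RUNNING --format="%.18i %.10u %.10M %N"
}
```

For example, `partition_nodes private-kruse-gpu` followed by `jobs_on_nodes 'gpu[020,030,031]'` (quoted so the shell does not treat the brackets as a glob pattern) should show whether the double-precision nodes are fully occupied and by whom.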

@ella.muller, you have no job in the queue right now, so this is hard to debug. Can you let us know when it happens again so we can check live?

Best

Edit: maybe a hint. @Ludovic.Dumoulin launched a job array. This means the full job array is already in the queue and will probably have a higher priority than a job you submit later, because a job's priority increases while it waits in the queue. You can check the priority details with sprio.
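As a rough sketch of how that check could look, assuming the standard `sprio` and `squeue` commands (the wrapper names `pending_priority` and `pending_reason` are hypothetical helpers, used here only to keep the calls readable):

```shell
#!/usr/bin/env bash
# Sketch only: assumes standard Slurm client tools are installed.
# The function names are illustrative, not part of Slurm.

# Show the priority factors (age, fair-share, ...) of pending jobs
# on a partition, optionally limited to a comma-separated user list.
pending_priority() {
    local partition="$1" users="${2:-}"
    if [ -n "$users" ]; then
        sprio --long --partition="$partition" --user="$users"
    else
        sprio --long --partition="$partition"
    fi
}

# Show a user's pending jobs with their priority, expected start time
# and the reason Slurm gives for keeping them in the queue.
pending_reason() {
    local user="$1"
    squeue --user="$user" --states=PENDING --format="%.18i %.9P %.10Q %.20S %r"
}
```

Running `pending_priority private-kruse-gpu muelleel,dumoulil` while both users have pending jobs should make it visible whether the age factor of the older job array explains the difference, and `pending_reason muelleel` shows whether the job is waiting on priority or on resources.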

Hi,

We are all using the same script, and we all use double precision.

Then it makes sense, everything seems to be working fine!
Sorry about that.

I'll see about closing the topic!

Thank you,
