sbatch: error: Batch job submission failed: Unable to contact slurm controller (connect failure)

Hello,

I cannot submit jobs using sbatch on Yggdrasil:
sbatch: error: Batch job submission failed: Unable to contact slurm controller (connect failure)

Using salloc seems to work.

Could you please help me?

Thank you!

I am still having problems with the slurm controller… any news?

For future reference, I found the cause of the error: a bad job configuration file.

I copied the config from Baobab, but apparently some of the arguments were incompatible.
I suspect I was asking for too many cores on a GPU node, but I did not investigate further.

I am still a bit surprised by the SLURM error itself (“connect failure”), though…

Hi,

Maybe you specified the Baobab cluster in your sbatch script and submitted your job from Yggdrasil? This isn’t possible (yet!).

Hey Yann,

I think you are right. I did not specify baobab, but I did use “yggdrasil”!

I thought the problem was the number of cores, but that raises a different error.

I now understand why it complains about the connection… I assume the cluster “yggdrasil” does not exist, right?

The cluster yggdrasil does exist, but our current configuration does not handle the pragma #SBATCH --cluster=XX correctly. Please do not specify the cluster name in your sbatch/srun.

Best

Yann
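
To check which job scripts still pin a cluster name, something like the following sketch may help. The function name find_cluster_directives is hypothetical, and the pattern assumes the directive is written as #SBATCH --cluster=..., #SBATCH --clusters=... or #SBATCH -M ... at the start of a line.

```shell
# Hypothetical helper: list files under a directory that contain an
# "#SBATCH --cluster(s)" or "#SBATCH -M" directive.
# Assumption: the directive starts at the beginning of a line.
find_cluster_directives() {
    # -r: recurse, -l: print only file names, -E: extended regex
    grep -rlE '^#SBATCH[[:space:]]+(--clusters?[=[:space:]]|-M)' "${1:-.}"
}
```

Calling find_cluster_directives ~/jobs (an example path) prints the matching files, so the offending line can then be removed from each of them.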

Sure, I corrected all my scripts. Thanks!

Hi, I am facing the same error.

sbatch: error: Batch job submission failed: Unable to contact slurm controller (connect failure)

My job does not contain the specifications mentioned above:

#!/bin/bash
#SBATCH --partition=public-cpu
#SBATCH --time=00:30
#SBATCH --mail-user=adriano.rutz@unige.ch
#SBATCH --mail-type=ALL

ml GCC/9.3.0 Singularity/3.7.3-Go-1.14

# convert job array index to four digits, padded with zeros
printf -v FILE_INDEX "%04d" ${SLURM_ARRAY_TASK_ID}

FILE=test/test-${FILE_INDEX}.txt

srun singularity run cfm-4/cfm.simg -c "cfm-predict $FILE 0.001 /trained_models_cfmid4.0/[M+H]+/param_output.log /trained_models_cfmid4.0/[M+H]+/param_config.txt 1"
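
As a quick way to check the file name each array task will pick up, the padding step can be tried outside of Slurm. This is only a sketch: the function name input_file_for_task is hypothetical, and it assumes the test/test-NNNN.txt naming used in the script above.

```shell
# Hypothetical helper: map an array task ID to its input file,
# zero-padding the index to four digits as "%04d" does in the job script.
input_file_for_task() {
    index=$(printf '%04d' "$1")
    printf '%s\n' "test/test-${index}.txt"
}

input_file_for_task 7    # prints test/test-0007.txt
```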

Hi Adriano,

We were updating the Slurm configuration on Baobab at the same time you submitted your job.

Could you try again and let me know if it’s still not working?

Hi Adrien,

sbatch --array=1-10 run_cfm_test.sh
Submitted batch job 56373639

Thank you!