Some users wrote to us to report that they could not submit jobs on Baobab and received the following error message:
sbatch: error: Slurm temporarily unable to accept job, sleeping and retrying
The reason is that the Slurm queue is full of jobs in either the pending or the running state. The maximum number of jobs allowed in the queue for the whole cluster is 60k, which is already quite a high number. We have spotted users who had a very large number of jobs in the queue and asked them to limit their number of pending jobs.
We have also added a per-user limit on running and pending jobs. This limit is now 10k.
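
If you submit many jobs from a script, it may help to check how many you already have in the queue before submitting more. The following is a minimal sketch of one way to do that, not an official tool: the function names and the safety margin of 100 are our own illustrative choices, and the count is simply one per line of `squeue` output, so job arrays may be counted differently from how Slurm applies the limit.

```python
import getpass
import subprocess

# Per-user limit on running + pending jobs, as stated in this announcement.
PER_USER_LIMIT = 10_000


def count_jobs(squeue_output: str) -> int:
    """Count non-empty lines (one per listed job) in `squeue --noheader` output."""
    return sum(1 for line in squeue_output.splitlines() if line.strip())


def remaining_slots(queued: int, limit: int = PER_USER_LIMIT, margin: int = 100) -> int:
    """Jobs that can still be submitted while staying `margin` below the limit.

    The margin is an arbitrary illustrative value, not a cluster setting.
    """
    return max(0, limit - margin - queued)


def my_queued_jobs() -> int:
    """Ask Slurm for the current user's pending and running jobs."""
    result = subprocess.run(
        ["squeue", "--noheader", "--user", getpass.getuser(),
         "--states", "PENDING,RUNNING", "--format", "%i"],
        capture_output=True, text=True, check=True,
    )
    return count_jobs(result.stdout)
```

On a login node, `remaining_slots(my_queued_jobs())` gives the number of jobs a submission script could still send; when it returns 0, the script should wait and check again later rather than keep calling `sbatch`.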