[SLURM] Rejection of GPU job submissions without GPU requests

Dear Users,

We have noticed instances where node resources in the GPU partition were allocated without any corresponding GPU requests from job submissions. This type of resource allocation can lead to inefficient utilization of resources and hinder the availability of GPUs for other users.

To ensure the appropriate use of GPU resources, we have implemented a new policy. Going forward, any job submission to the GPU partition will require a corresponding GPU request. If you submit a job to the GPU partition without requesting GPU resources, the submission will be rejected.

This policy change is aimed at promoting fair resource allocation and optimizing the availability of GPUs for users who genuinely require them. We kindly request all users to review their job submission scripts and ensure that the appropriate GPU requests are included.
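As a starting point for that review, here is a rough sketch of a pre-submission check. The function name `has_gpu_request` is hypothetical, and the assumption is that your GPU request is written as an `#SBATCH` directive using the standard Slurm options `--gres=gpu...` or `--gpus...`. Requests passed on the command line, or written with the short form `-G`, are not detected, so treat a negative result as a hint rather than a verdict.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Slurm): succeeds if the given job script
# contains an #SBATCH directive that requests a GPU.
has_gpu_request() {
    # Matches e.g. "#SBATCH --gres=gpu:1", "#SBATCH --gpus=1",
    # "#SBATCH --gpus-per-node=2". The short form "-G" is not covered.
    grep -Eq '^#SBATCH[[:space:]]+.*(--gres[= ]gpu|--gpus)' "$1"
}

# Report on every script passed as an argument.
for script in "$@"; do
    if has_gpu_request "$script"; then
        echo "$script: GPU request found"
    else
        echo "$script: no GPU request found"
    fi
done
```

Saved under a name of your choice, it can be run against one or more job scripts before you submit them to the GPU partition.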

Additionally, we would like to remind you that if your job does not require GPU resources, the CPU partition must be used. The CPU partition is designed for jobs that primarily rely on CPU processing power without the need for GPU acceleration.

By adhering to these guidelines and selecting the appropriate partition for your job, you contribute to the efficient allocation of resources and promote fair usage across our system.

Example:

(baobab)-[alberta@login2 ~] $ srun --partition=shared-gpu hostname
srun: error: You are trying to submit on gpu partition without requesting gpu, do you really need to use a gpu node ? 
srun: error: Unable to allocate resources: Invalid generic resource (gres) specification
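
For comparison, here is a minimal sketch of a batch script that does include a GPU request. It assumes that the standard Slurm option `--gpus` satisfies the new check (the classic `--gres=gpu:1` form is the usual alternative). The time limit is illustrative only, and `CUDA_VISIBLE_DEVICES` is assumed to be set by Slurm for GPU allocations, which depends on the cluster configuration.

```shell
#!/bin/sh
# GPU partition taken from the example above.
#SBATCH --partition=shared-gpu
# Request one GPU; assumed to satisfy the new submission check.
#SBATCH --gpus=1
# Illustrative time limit, adjust to your job.
#SBATCH --time=00:05:00

# Print the node name and the GPUs exposed to the job, if any.
echo "Running on $(uname -n)"
echo "Visible GPUs: ${CUDA_VISIBLE_DEVICES:-none}"
```

The interactive equivalent of the rejected command above would be `srun --partition=shared-gpu --gpus=1 hostname`.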

If you notice any unexpected behavior, we encourage you to respond to this post and provide us with your feedback.

Thank you for your cooperation.
