I am wondering how to request a specific GPU model in an sbatch call.
Here is my situation: is it possible to explicitly request the RTX 3080 GPUs on Baobab?
In the doc, you can specify the VRAM per GPU and the precision, so most of the time I do:
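For context, here is a sketch of the kind of request I mean. The partition and time values are placeholders, and the `VramPerGpu` / precision syntax is my reading of the Baobab documentation:

```shell
#!/bin/sh
#SBATCH --partition=shared-gpu          # placeholder partition name
#SBATCH --time=01:00:00
# Ask for one GPU with at least 10GB of VRAM on it
# (VramPerGpu gres as described in the Baobab docs)
#SBATCH --gres=gpu:1,VramPerGpu:no_consume:10G
# Optionally require a double-precision-capable GPU ("the precision")
#SBATCH --constraint=DOUBLE_PRECISION_GPU

srun nvidia-smi
```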
Dear @Yann.Sagon, you are right, I had two RTX 3090s.
Yet is there a way to request a specific GPU model?
And where can I get an overview of GPU utilisation, to adapt my sbatch call accordingly?
The reason:
I want to launch jobs where what matters is the total VRAM, which I can share between N GPUs. I am trying to get the best availability on Baobab, to be able to launch multiple jobs, for example ~100 jobs, each needing let's say 30GB of VRAM.
3x10GB would work, but if the RTX 3080s are not available, then my jobs use 3x25GB, or 3x48GB (on node 48), which is a pity and way too much for my job (I would rather use 1x48GB per job, and thus get more jobs done with the same pool of GPUs).
Hence my desire to request a specific GPU model: in my example, each job would request 3x10GB RTX 3080 if those are available, or 2x25GB RTX A5500 or RTX 3090 if those are the GPUs available, or 1x48GB RTX A6000 if those are available.
But
#SBATCH --constraint=COMPUTE_MODEL_RTX_3080_10G
does not work, although you seemed to propose this here:
I would like to better understand your workflow to give you a more accurate answer.
If I understand correctly, each of your jobs is able to use, let's say, three GPUs with 10GB of RAM at the same time, and it "sees" 30GB of GPU RAM? Is that correct?
When you are talking about 100 jobs: are you referring to 100 sbatch instances, a job array, or one sbatch with 100 jobs inside, using a for loop for example?
This was never implemented, as in the end the use of VramPerGpu did the trick for that user.
As this may be needed for your use case, I have implemented that this afternoon on Baobab. You can check on the documentation the constraint name to use to target a specific GPU.
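As a sketch, assuming the constraint names listed in the documentation, you could target one model directly, or accept several models by combining constraints with `|` (Slurm constraints support OR):

```shell
# Target one specific GPU model (constraint name per the Baobab docs):
#SBATCH --gres=gpu:3
#SBATCH --constraint=COMPUTE_MODEL_RTX_3080_10G

# Or accept any of several models; the second constraint name below is
# illustrative, so check the documentation for the exact spelling:
#SBATCH --constraint="COMPUTE_MODEL_RTX_3080_10G|COMPUTE_MODEL_RTX_A6000_48G"
```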
In the output, have a look at the columns GRES and GRES_USED, which display the number of GPUs vs the number of allocated GPUs.
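For example, something like this (a sketch; `Gres` and `GresUsed` are standard `sinfo --Format` fields, and the partition name is a placeholder):

```shell
# Show, per node, the configured GPUs vs the GPUs currently allocated
sinfo -p shared-gpu -N -O "NodeHost:15,Gres:40,GresUsed:60"
```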
One more thing: if you specify that you want a GPU with 10GB of RAM and there’s none available, you’ll get a model with more RAM, but the model will be chosen based on its “weight”. The heavier it is, the less likely you’ll get it. Check our documentation to see the weight associated with each GPU.
If I understand correctly, each of your jobs is able to use, let's say, three GPUs with 10GB of RAM at the same time, and it "sees" 30GB of GPU RAM? Is that correct?
Yes! I am using LLMs locally, and I load the model across several GPUs.
When you are talking about 100 jobs: are you referring to 100 sbatch instances, a job array, or one sbatch with 100 jobs inside, using a for loop for example?
Speaking about job array here.
One more thing: if you specify that you want a GPU with 10GB of RAM and there’s none available, you’ll get a model with more RAM, but the model will be chosen based on its “weight”. The heavier it is, the less likely you’ll get it. Check our documentation to see the weight associated with each GPU.
Yes, and that is why I would like to avoid that: I don't want to have to use multiple "heavy" GPUs when light ones would be enough: it penalises me for future jobs (longer queues).