Hello
Username: leleu
Cluster: yggdrasil
Subject: system
jobid: 14525939 14525957
I am submitting my job using the following file:
=============
#!/bin/sh
#SBATCH --partition=debug-gpu
#SBATCH --gpus=1
#SBATCH --time=14:59
make -f makefiles/dataset_TESSv5htira_5to20.make model
=============
I submitted this job twice, and both instances are running at the same time. If I ssh to the node they are running on (gpu001 for this test, but the behaviour was the same on a GPU node of the public-gpu queue) and run:
top -u leleu
I see two different processes. However, if I use
nvidia-smi -l
only the first job I submitted is displayed (see screenshot). If I kill this first job, I get kicked off the gpu001 node, as if I no longer had any job running on it. However, I can ssh to it again, and if I run the command
nvidia-smi -l
again, this time the remaining second job is displayed. How can I see the GPU usage of both jobs simultaneously?
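For reference, the steps above can be summarized in the following sketch. The file name job.sh is only a placeholder for the submission file shown above, and the run and repro helpers are names made up for this illustration. By default the script only prints the commands (DRY_RUN=1), and it uses one-shot variants of top and nvidia-smi rather than the interactive ones I typed by hand.

```shell
#!/bin/sh
# Sketch of the reproduction steps described above.
# Assumption: the submission file is saved as job.sh (placeholder name).
JOB_SCRIPT="${JOB_SCRIPT:-job.sh}"
NODE="${NODE:-gpu001}"
CLUSTER_USER="${CLUSTER_USER:-leleu}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

repro() {
    # Submit the same job twice.
    run sbatch "$JOB_SCRIPT"
    run sbatch "$JOB_SCRIPT"
    # Check that both jobs are running, and on which node.
    run squeue -u "$CLUSTER_USER"
    # On the compute node, both processes show up in top (batch mode, one iteration) ...
    run ssh "$NODE" top -b -n 1 -u "$CLUSTER_USER"
    # ... but nvidia-smi lists only the process of the first job.
    run ssh "$NODE" nvidia-smi
}

repro
```

Setting DRY_RUN=0 on the login node would actually execute the commands instead of only printing them.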
Many thanks
Adrien Leleu