(base) (baobab)-[shekhza2@cpu322 benchmark]$ squeue -p private-kalousis-gpu
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
10159856 private-k ant-migr shekhza2 PD 0:00 1 (Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions)
There’s no need to send or post a message about down or drained nodes. We check every day which nodes are ready to go back into production. We wait until all work has been completed before intervening on a node, to avoid any impact on production.
PS: In your sbatch, you can specify multiple partitions, allowing your jobs to start on the first available node according to your priority.
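
As a rough sketch of the multi-partition tip, a job script could look something like the one below. Slurm accepts a comma-separated list in `--partition` and starts the job in whichever listed partition can run it first. Note that `shared-gpu` is only a placeholder assumption for a second partition you have access to (check `sinfo` for the real names), and the `echo` line stands in for your actual workload.

```shell
#!/bin/bash
# Comma-separated partition list: Slurm starts the job in whichever
# listed partition can run it first.
# "shared-gpu" is an assumed placeholder; replace it with a partition
# you actually have access to (see `sinfo`).
#SBATCH --partition=private-kalousis-gpu,shared-gpu

# SLURM_JOB_PARTITION is set by Slurm inside a running job; the fallback
# text is only used when the script is run outside Slurm.
partition="${SLURM_JOB_PARTITION:-not running under Slurm}"
echo "Job partition: ${partition}"
```

Once the job starts, `squeue -u $USER` shows which of the listed partitions it landed in.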