Full partitions are down on yggdrasil

Currently, there are partitions where all nodes are down:

PARTITION          AVAIL  TIMELIMIT   NODES  STATE  NODELIST
debug-gpu          up     15:00       1      down*  gpu001
private-euclid     up     7-00:00:00  10     down*  cpu[125-134]
private-astro-cpu  up     7-00:00:00  18     down*  cpu[123-124,135-150]
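
For anyone who wants to run the same check, here is a minimal sketch that takes sinfo-style text and reports the partitions in which every listed row is in a down state. The names `SAMPLE` and `fully_down_partitions` are hypothetical and only for illustration. The sketch assumes the default six-column layout shown above and treats any state starting with "down" (such as `down*`) as down.

```python
from collections import defaultdict

# Sample input copied from the listing above (assumed default sinfo column layout).
SAMPLE = """\
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug-gpu up 15:00 1 down* gpu001
private-euclid up 7-00:00:00 10 down* cpu[125-134]
private-astro-cpu up 7-00:00:00 18 down* cpu[123-124,135-150]
"""


def fully_down_partitions(sinfo_text):
    """Return the sorted partitions whose every listed row is in a down state."""
    states = defaultdict(list)
    for line in sinfo_text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore lines that do not match the assumed layout
        # sinfo marks the default partition with a trailing "*"; strip it.
        partition = fields[0].rstrip("*")
        states[partition].append(fields[4])
    return sorted(
        name
        for name, seen in states.items()
        if all(state.startswith("down") for state in seen)
    )


print(fully_down_partitions(SAMPLE))
```

A partition that also has a row in another state (for example `idle`) is not reported, so only fully unavailable partitions show up.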

Is another big problem arising, like the one in [2024] Current issues on HPC Cluster - #23 by Gael.Rossignol?

According to the monitoring of the affected nodes, they went down suddenly shortly before midnight (Sep. 27-28); here is an example for node cpu150.


Hi, any news on this? It is quite hard to get GPU time, even for short (<15 min) jobs.
Thanks!

Please check the reason here: [2024] Current issues on HPC Cluster - #24 by Yann.Sagon