Currently, there are partitions where all nodes are down:
PARTITION          AVAIL  TIMELIMIT   NODES  STATE  NODELIST
debug-gpu          up     15:00       1      down*  gpu001
private-euclid     up     7-00:00:00  10     down*  cpu[125-134]
private-astro-cpu  up     7-00:00:00  18     down*  cpu[123-124,135-150]
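In case it helps anyone check this on their side, here is a rough Python sketch that picks out the fully-down partitions from a listing like the one above. It assumes the default six-column `sinfo` layout shown here; the function name `down_partitions` and the embedded sample text are my own and purely illustrative. The trailing `*` on the state means the node is not responding, so it is stripped before comparing.

```python
# Sample text copied from the listing above (default sinfo column layout assumed).
SINFO_OUTPUT = """\
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug-gpu up 15:00 1 down* gpu001
private-euclid up 7-00:00:00 10 down* cpu[125-134]
private-astro-cpu up 7-00:00:00 18 down* cpu[123-124,135-150]
"""


def down_partitions(text):
    """Return {partition: node_count} for partitions whose nodes are all down.

    Hypothetical helper: expects whitespace-separated rows with exactly the
    six columns PARTITION, AVAIL, TIMELIMIT, NODES, STATE, NODELIST.
    """
    totals = {}  # all nodes seen per partition
    down = {}    # nodes in state "down" per partition
    for line in text.strip().splitlines()[1:]:  # skip the header row
        partition, _avail, _limit, nodes, state, _nodelist = line.split()
        partition = partition.rstrip("*")  # sinfo marks the default partition with '*'
        count = int(nodes)
        totals[partition] = totals.get(partition, 0) + count
        if state.rstrip("*") == "down":  # '*' suffix: node is not responding
            down[partition] = down.get(partition, 0) + count
    # Keep only partitions where every listed node is down.
    return {p: n for p, n in totals.items() if down.get(p, 0) == n}


print(down_partitions(SINFO_OUTPUT))
```

A partition that still has some idle or allocated nodes appears in several rows of the listing, so the node counts are summed per partition before comparing.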
Is another major problem arising, like the one in [2024] Current issues on HPC Cluster - #23 by Gael.Rossignol?
According to the monitoring of the affected nodes, they went down suddenly shortly before midnight (Sep. 27-28); here is an example for node cpu150.