_gpu[002,012]_ still down at 16:00 on 2020-06-18

Hi there,

The service restored at 11:44 was indeed the ${HOME} storage only (cf. Current issues on Baobab and Yggdrasil - #13 by Luca.Capello).

gpu[002,012] are still DOWN in Slurm because:

  • gpu002: PSU2 failed, and since this node has 6 TITAN X GPUs, a single PSU is not enough; a replacement has already been requested.
  • gpu012: the node itself was fine, but during the last check before Slurm activation I found another problem in the same rack (leaf7, cf. Current issues on Baobab and Yggdrasil - #15 by Luca.Capello).

Thx, bye,
Luca