Yggdrasil scheduled maintenance: 26th - 27th of May 2021

Dear users,

We will perform software and hardware maintenance on the Yggdrasil HPC cluster
on 26th - 27th of May 2021.

The maintenance will start at 08:00 +0100, and you will receive an email
once it is over.

The cluster will be totally unavailable during this period, with no
access at all (not even to retrieve files).

If you submit a job in the meantime, make sure that the expected wall time
(duration) does not overlap with the start of the maintenance; otherwise your
job will only be scheduled after the maintenance.
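
The wall-time check described above can be sketched in Python. This is only
an illustration: the function names are hypothetical, and it assumes the job
starts running at the moment it is submitted (any time spent waiting in the
queue shortens the usable window further).

```python
from datetime import datetime, timedelta, timezone

# Maintenance start as announced: 26 May 2021, 08:00 +0100.
MAINTENANCE_START = datetime(
    2021, 5, 26, 8, 0, tzinfo=timezone(timedelta(hours=1))
)


def max_wall_time(submit_time: datetime) -> timedelta:
    """Longest wall time that still ends before the maintenance starts.

    Hypothetical helper; assumes the job starts at submit_time.
    """
    return max(MAINTENANCE_START - submit_time, timedelta(0))


def fits_before_maintenance(submit_time: datetime, wall_time: timedelta) -> bool:
    """True if a job started at submit_time ends by the maintenance start."""
    return submit_time + wall_time <= MAINTENANCE_START


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("Maximum wall time before maintenance:", max_wall_time(now))
```

A wall time no longer than this limit can then be requested at submission,
for example through the `--time` option of Slurm's `sbatch`.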

What is planned for this maintenance:

  1. OS upgrade
  2. Login1 reinstallation
  3. Slurm update
  4. various

Thanks for your understanding.

Best regards,
the HPC team

Dear users,

the maintenance is now over.

What was done:

  • OS upgrade (security and bug fixes)
  • Login1 reinstallation (disk partitioning)
  • Slurm update: 20.11.7, NVML support activation (this is for us)
  • private /tmp, /dev/shm, /var/tmp, and /scratch on compute nodes
  • reclassification of the compute nodes under unified Vx numbers
  • various

We had an issue during the maintenance: some jobs were released too early (I hope nobody will complain about that!), and we decided to let them run to completion. As a result, we’ll reinstall the remaining nodes today, so don’t be surprised if you see compute nodes in the drain state.

Best regards,
the HPC team