Yggdrasil scheduled maintenance: 24-25 May 2023

Dear users,

As just announced on the baobab-announce@ mailing list, we will carry out software and hardware maintenance on the Yggdrasil HPC cluster on 24 and 25 May 2023.

The maintenance will start at 08:00 +0100, and you will receive an email when it is over.

The cluster will be totally unavailable during this period, with no access at all (not even to retrieve files).

If you submit a job in the meantime, make sure that its requested wall time (duration) does not overlap with the start of the maintenance; otherwise your job will only be scheduled after the maintenance.
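
One way to check this before submitting is to compute how many minutes remain until the maintenance starts and request no more than that. The sketch below is only an illustration: it assumes GNU date is available (for the -d option), and the function name minutes_until_maintenance and the job script name my_job.sh are hypothetical.

```shell
# Sketch: compute the longest wall time (in minutes) that still ends
# before the maintenance starts. Assumes GNU date (for the -d option).

# Maintenance start as announced: 24 May 2023, 08:00 +0100.
MAINT_START="2023-05-24 08:00:00 +0100"

# minutes_until_maintenance [NOW]
# NOW is any timestamp GNU date understands; defaults to the current time.
# Prints the remaining whole minutes, or fails if the start has passed.
minutes_until_maintenance() {
    now_epoch=$(date -d "${1:-now}" +%s)
    start_epoch=$(date -d "$MAINT_START" +%s)
    remaining=$(( (start_epoch - now_epoch) / 60 ))
    [ "$remaining" -gt 0 ] || return 1
    echo "$remaining"
}

# Example usage (hypothetical job script; sbatch accepts plain minutes):
#   sbatch --time="$(minutes_until_maintenance)" my_job.sh
```

A job whose requested wall time fits within that window can still start before the maintenance; a longer one simply waits until the cluster is back.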

What should be done during this maintenance:

  1. reinstall all the nodes with Rocky8 (major upgrade from CentOS7)
  2. reinstall login node with Rocky8
  3. upgrade Slurm to version 23.0.2 (major upgrade from Slurm 22.x)
  4. upgrade BeeGFS to 7.2.9
  5. upgrade Mellanox to 4.9.6
  6. upgrade servers with the latest bug and security fixes
  7. replace faulty disks on storage servers

Thanks for your understanding.

Best regards,
the HPC team

OK, that’s a bit short on notice (usually there was a first announcement of maintenance 2 weeks in advance, which wasn’t done this time). I have a large job array running (submitted 1 week ago, with individual job wall-time limits of 2 days); it is ~5/6 done. Because there wasn’t any earlier announcement, I can’t guarantee that it will finish before May 24. I cancelled all other submitted but not yet started job arrays. Do you have any advice on how to deal with the array, where the majority has already run?

Btw, would it be possible to update gnuplot from 4.6 (this version is already 9 years old) to a recent version, e.g. 5.4?

All best,
Matthias Kruckow

Sorry for that. Indeed, we try to announce maintenance one month in advance, but we forgot to do so this time.

This isn’t an issue: any running job will have finished by the start of the maintenance, as they have only 2 days of wall time. Even if not all the jobs in your job array have finished by then, the remaining ones will continue once the maintenance is done.

This wasn’t needed. When we plan a maintenance, we create a reservation that prevents any job from running during the maintenance, so you don’t have to worry about that.
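
If you want to see such a reservation yourself, Slurm can list it with scontrol show reservation. Here is a minimal sketch, assuming the one-record-per-line output of scontrol -o with its ReservationName, StartTime and EndTime fields; the helper name reservation_windows is hypothetical.

```shell
# Sketch: print the name, start and end of each Slurm reservation.
# Reads `scontrol -o show reservation` output on stdin (one record per
# line, fields separated by spaces, each of the form Key=Value).
reservation_windows() {
    # Put every Key=Value field on its own line, then keep the three
    # fields that describe the reservation window.
    tr ' ' '\n' | grep -E '^(ReservationName|StartTime|EndTime)='
}

# Example usage on a login node:
#   scontrol -o show reservation | reservation_windows
```

Jobs whose time limit would run into the window shown there stay pending until the reservation ends.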

Unfortunately there is no generic trick to handle that.

As we’ll upgrade Yggdrasil to Rocky8, gnuplot will then be at version 5.2. You can also use the version we provide through modules, which is even newer:

(baobab)-[sagon@login2 ~]$ ml spider gnuplot

      Portable interactive, function plotting utility

Sorry to bother you, but Yggdrasil is still not available now (26th). Is it still in maintenance, and if yes, do you know approximately when it will be accessible again? (I need to transfer files.)

Best regards

Dear all, the maintenance is now over. Thanks for your patience.

Do not hesitate to let us know if you see something unusual after the maintenance.