As just announced on the baobab-announce@ mailing list, we will perform software and hardware maintenance on the Baobab HPC cluster
from 30 June to 1 July 2021.
The maintenance will start at 08:00 +0100 and you will receive an email
when the maintenance is over.
The cluster will be totally unavailable during this period, with no
access at all (not even to retrieve files).
If you submit a job in the meantime, make sure that its expected wall time
(duration) does not overlap with the start of the maintenance; otherwise your
job will only be scheduled after the maintenance.
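To make the wall-time constraint above concrete, here is a minimal Python sketch that checks whether a job would end before the maintenance starts. It is only an illustration: the function names are hypothetical, the start date and offset are taken from this announcement, and it assumes the best case where the job starts running the moment it is submitted (queue waiting time makes the real margin smaller). It also assumes that a job ending exactly at the start of the maintenance is acceptable. The wall time is the value you would pass to Slurm with `--time`.

```python
from datetime import datetime, timedelta, timezone

# Maintenance start as stated in the announcement: 30 June 2021, 08:00 +0100.
MAINTENANCE_START = datetime(2021, 6, 30, 8, 0, tzinfo=timezone(timedelta(hours=1)))


def fits_before_maintenance(start_time, wall_time):
    """Return True if a job starting at start_time with the requested
    wall time would end no later than the start of the maintenance.

    Assumption: the job starts immediately at start_time (no queue wait).
    """
    return start_time + wall_time <= MAINTENANCE_START


def max_wall_time(start_time):
    """Longest wall time that still ends before the maintenance.

    Returns a zero duration if the maintenance has already started.
    """
    return max(MAINTENANCE_START - start_time, timedelta(0))


if __name__ == "__main__":
    # Hypothetical example: a job starting the evening before the maintenance.
    start = datetime(2021, 6, 29, 20, 0, tzinfo=timezone(timedelta(hours=1)))
    print(fits_before_maintenance(start, timedelta(hours=11)))
    print(fits_before_maintenance(start, timedelta(hours=13)))
    print(max_wall_time(start))
```

If the check fails, requesting a shorter wall time (when the job allows it) is the simplest way to still run before the maintenance.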
What we plan to do during this maintenance:
- Refactor Baobab “master” (the central Baobab server) to a new server “admin1” using CentOS7
- Upgrade BeeGFS home servers to CentOS7
- Upgrade Slurm to version 20.11.7
- Deactivate MIG on the A100 card
- Reinstall all the compute nodes with the latest bug fixes
As most of these tasks depend on having a working “admin1” server, the dependent tasks will be carried out only if possible.
Thanks for your understanding.
The HPC team