As just announced on the baobab-announce@ mailing list, we will perform software and hardware maintenance on the Baobab HPC cluster on 22 and 23 November 2023.
The maintenance will start at 08:00 +0100, and you will receive an email when it is over.
The cluster will be totally unavailable during this period, with no access at all (not even to retrieve files).
If you submit a job in the meantime, make sure that its requested wall time (duration) does not overlap with the start of the maintenance; otherwise your job will only be scheduled after the maintenance.
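As a rough illustration of the wall time rule above, the following Python sketch computes the longest duration you could request so that a job starting right now would still finish before the maintenance begins. The function names and the 10-minute safety margin are assumptions made for this example, not something provided by the cluster. The result is formatted as `D-HH:MM:SS`, one of the forms accepted by Slurm's `--time` option.

```python
from datetime import datetime, timedelta, timezone

# Maintenance start taken from the announcement: 22 November 2023, 08:00 +0100.
MAINTENANCE_START = datetime(2023, 11, 22, 8, 0, tzinfo=timezone(timedelta(hours=1)))


def max_walltime(now, margin=timedelta(minutes=10)):
    """Longest wall time that still ends before the maintenance starts.

    `margin` is an arbitrary safety buffer (an assumption of this sketch).
    Returns a zero duration if the maintenance window has already been reached.
    """
    remaining = MAINTENANCE_START - now - margin
    return max(remaining, timedelta(0))


def slurm_time(delta):
    """Format a timedelta as D-HH:MM:SS, suitable for sbatch --time."""
    total = int(delta.total_seconds())
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("--time=" + slurm_time(max_walltime(now)))
```

Note that this is only an upper bound: it assumes the job starts immediately. If the job waits in the queue, the time left before the maintenance shrinks accordingly, so requesting a shorter wall time is safer.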
What is planned during this maintenance:
- Increase disk space on
- Various hardware fixes (battery, fan and disk replacements)
- Spread the storage servers more evenly across our InfiniBand switches to improve load balancing and reduce network congestion
- Update BeeGFS to 7.4.1
- Reinstall the Slurm server on Rocky 8 and update Slurm to version 23.02.06
- Reinstall all the nodes with the latest Rocky 8 (8.8)
- Upgrade the servers with the latest security and bug fixes
Thanks for your understanding.
The HPC team