Baobab scheduled maintenance: 30th of June - 1st of July 2021

Dear users,

as just announced on the baobab-announce@ mailing list, we will perform software and hardware maintenance on the Baobab HPC cluster
on 30th of June - 1st of July 2021.

The maintenance will start at 08:00 +0100, and you will receive an email
when it is over.

The cluster will be totally unavailable during this period, with no
access at all (not even to retrieve files).

If you submit a job in the meantime, make sure that its expected wall time
(duration) does not overlap with the start of the maintenance; otherwise your
job will only be scheduled after the maintenance.
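
As a rough illustration of the wall time rule above, here is a minimal Python sketch that checks whether a job would end before the maintenance starts. It is not an official tool: the function names are made up for this example, it assumes the job starts the moment it is submitted (the best case, since time spent waiting in the queue pushes the end later), and it only understands wall times written as D-HH:MM:SS or HH:MM:SS.

```python
from datetime import datetime, timedelta, timezone

# Start of the maintenance as announced: 30 June 2021, 08:00 +0100.
MAINTENANCE_START = datetime(2021, 6, 30, 8, 0, tzinfo=timezone(timedelta(hours=1)))


def parse_walltime(spec: str) -> timedelta:
    """Parse a wall time written as D-HH:MM:SS or HH:MM:SS (assumed formats)."""
    days = 0
    if "-" in spec:
        day_part, spec = spec.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in spec.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def finishes_before_maintenance(start_time: datetime, walltime: str) -> bool:
    """True if a job starting at start_time ends no later than the maintenance start."""
    return start_time + parse_walltime(walltime) <= MAINTENANCE_START


if __name__ == "__main__":
    # Hypothetical job: started on 29 June 2021 at 08:00 +0100, 12 hours requested.
    start = datetime(2021, 6, 29, 8, 0, tzinfo=timezone(timedelta(hours=1)))
    print(finishes_before_maintenance(start, "12:00:00"))
```

If the check returns False, requesting a shorter wall time is the only way to have the job run before the maintenance.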

What is planned for this maintenance:

  1. Migrate Baobab “master” (the central Baobab server) to a new server, “admin1”, running CentOS 7
  2. Upgrade BeeGFS home servers to CentOS 7
  3. Upgrade Slurm to version 20.11.7
  4. Deactivate MIG on the A100 card
  5. Reinstall all the compute nodes with the latest bugfixes

As most of the tasks depend on having a working “admin1” server, the remaining tasks will be carried out only if possible.

Thanks for your understanding.

Best regards,
the HPC team

Dear users,

the maintenance is now over!

What was done during this maintenance:

  • Upgrade Slurm to version 20.11.7
  • Deactivate MIG on the A100 card on GPU020
  • Enable local private space per job on compute nodes: /dev/shm, /tmp, /scratch
  • Reinstall all the compute nodes with the latest bugfixes
  • Reinstall all the BeeGFS home servers with CentOS 7 and perform a filesystem check
  • Various fixes
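
For the per-job private space mentioned in the list above, a minimal Python sketch of how a job might use it for intermediate files. This is an illustration only: the helper names and the order of preference between /scratch and /tmp are assumptions, and the announcement does not say what happens to these directories when the job ends, so anything worth keeping should be copied to permanent storage before the job finishes.

```python
import os
import tempfile

# Directories named in the announcement; the order of preference is an assumption.
CANDIDATES = ("/scratch", "/tmp")


def pick_job_dir() -> str:
    """Return the first candidate that exists and is writable, else the default temp dir."""
    for path in CANDIDATES:
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return tempfile.gettempdir()


def write_intermediate(data: bytes) -> str:
    """Write data to a uniquely named file in the job-local directory and return its path."""
    fd, path = tempfile.mkstemp(prefix="job_", suffix=".bin", dir=pick_job_dir())
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


if __name__ == "__main__":
    # Hypothetical usage inside a job: store a small intermediate result.
    print(write_intermediate(b"intermediate result"))
```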

Thanks in advance for your feedback if you notice something weird, or just because you want to say hello :)

Best regards,

the HPC team

Thank you!

P.S. Do we have Stata 17 on Baobab and/or on Yggdrasil?

Thanks a lot

Yes, we do: New software installed Stata 17 MP32

Best

Yann