Baobab scheduled maintenance: 03-04 March 2021

Dear users,

As just announced on the baobab-announce@ mailing list, we will perform
software and hardware maintenance on the Baobab HPC cluster on
03-04 March 2021.

The maintenance will start at 08:00 +0100 on 03 March, and you will receive
an email when it is over.

The cluster will be totally unavailable during this period, with no
access at all (not even to retrieve files).

If you submit a job in the meantime, make sure that its expected wall time
(duration) does not overlap with the start of the maintenance; otherwise
your job will only be scheduled after the maintenance.
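
As a rough illustration of the wall time rule above, the following Python sketch computes the longest wall time a job could request and still finish before the maintenance starts. It assumes the job starts immediately at submission time, which the scheduler does not guarantee; the helper names `max_walltime` and `slurm_time` are made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Maintenance start as announced: 03 March 2021, 08:00 +0100.
CET = timezone(timedelta(hours=1))
MAINTENANCE_START = datetime(2021, 3, 3, 8, 0, tzinfo=CET)


def max_walltime(now, start=MAINTENANCE_START):
    """Longest duration that still ends before `start` (hypothetical helper).

    Assumes the job begins running at `now`; a job that waits in the
    queue has less time available than this.
    """
    return max(start - now, timedelta(0))


def slurm_time(delta):
    """Format a timedelta as D-HH:MM:SS, a format accepted by --time."""
    total = int(delta.total_seconds())
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


# Example: submitting on 01 March 2021 at 20:00 +0100.
submitted = datetime(2021, 3, 1, 20, 0, tzinfo=CET)
limit = slurm_time(max_walltime(submitted))
```

The resulting string could then be passed as `--time=<limit>` (or a shorter value) when submitting the job.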

What is planned for this maintenance:

  1. hardware maintenance (batteries for RAID controllers)
  2. software: OS and Slurm upgrade
  3. removal of the old partition names (those with the -EL7 suffix)
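
Since partition names ending in -EL7 are going away, it may be worth checking job scripts for them. The Python sketch below only detects such references and does not rewrite anything. It assumes one option per `#SBATCH` line; the function name `legacy_partitions` and the partition names in the example are hypothetical.

```python
import re

# Matches a partition passed via -p or --partition on an #SBATCH line.
# Assumption: the partition option is the first option on that line.
PARTITION_RE = re.compile(r"^#SBATCH\s+(?:-p\s*|--partition[=\s]\s*)(\S+)")


def legacy_partitions(script_text):
    """Return (line_number, partition) pairs for names ending in -EL7."""
    found = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        match = PARTITION_RE.match(line.strip())
        if not match:
            continue
        # Several partitions may be given as a comma-separated list.
        for name in match.group(1).split(","):
            if name.endswith("-EL7"):
                found.append((number, name))
    return found


# Example job script; the partition names here are illustrative only.
example = """#!/bin/sh
#SBATCH --partition=shared-EL7
#SBATCH -p debug-EL7,shared
#SBATCH --time=01:00:00
"""
hits = legacy_partitions(example)
```

Any script reported this way would need its partition option updated before being resubmitted after the maintenance.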

Thanks for your understanding.

Best regards,
the HPC team

Dear users,

The maintenance is now over!

What was done:

  • OS updated to CentOS 7.9 (2009)
  • Slurm updated to version 20.11.3
    • enabled interactive steps: [hpc:slurm (eResearch Doc)](https://doc.eresearch.unige.ch/hpc/slurm#interactive_jobs)
    • new version of spart
    • removed the legacy partition names (those with the -EL7 suffix). See details.
  • Replaced the RAID batteries on two servers.
  • Put two new GPU servers into production.
  • Updated the BIOS on two GPU servers that had issues.
  • Reinstalled all the CPU and GPU nodes.
  • Migrated data; this should be transparent for you.
  • Many other fixes, cleanups, etc., simply because we like doing that :)

Enjoy the cluster and have a nice day.

HPC team,

Luca, Massimo, Rémy, Yann