Baobab scheduled maintenance: 25-26 November 2020

Dear users,

as just announced on the baobab-announce@ mailing list, we will perform software and hardware maintenance on the Baobab HPC cluster on Wednesday 25 November 2020 and Thursday 26 November 2020.

The maintenance will start at 08:00 +0100 and you will receive an email when it is over.

The cluster will be totally unavailable during this period, with no access at all (not even to retrieve files).

If you submit a job in the meantime, make sure that its expected wall time (duration) does not overlap with the start of the maintenance; otherwise your job will only be scheduled after the maintenance.
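
As a rough illustration of this wall-time check, the sketch below tests whether a job would finish before the maintenance begins. The start time (25 November 2020, 08:00 +0100) is taken from this announcement; the function name and the sample job start time are hypothetical, and the strict "ends before" comparison is a conservative assumption, not a statement of how the scheduler treats the exact boundary.

```python
from datetime import datetime, timedelta, timezone

# Time zone of the announcement (+0100).
CET = timezone(timedelta(hours=1))

# Maintenance start as announced: Wednesday 25 November 2020, 08:00 +0100.
MAINTENANCE_START = datetime(2020, 11, 25, 8, 0, tzinfo=CET)


def fits_before_maintenance(start: datetime, wall_time: timedelta) -> bool:
    """Return True if a job starting at `start` with the requested wall
    time would end strictly before the maintenance start (conservative)."""
    return start + wall_time < MAINTENANCE_START


# Hypothetical example: a job that starts on 24 November 2020 at 20:00 +0100.
job_start = datetime(2020, 11, 24, 20, 0, tzinfo=CET)
print(fits_before_maintenance(job_start, timedelta(hours=11)))  # True: ends at 07:00
print(fits_before_maintenance(job_start, timedelta(hours=13)))  # False: ends at 09:00
```

In practice the job may start later than the time it was submitted, so the requested wall time should leave some margin.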

Work planned during this maintenance:

  1. hardware maintenance (electrical power and network)
  2. software upgrades (OS, Slurm plugins, etc.)

Thanks for your understanding.

Best regards,
the HPC team

Dear HPC users,

the maintenance is now over!

What’s new in Baobab:

  • Slurm 20.02.6
  • it is now easier to launch remote graphical software through Slurm, using “salloc --x11”.
    We’ll create a topic on hpc-community about that and update the doc accordingly.
  • new GPGPU nodes installed in the datacentre, gpu015 and gpu016: 128 AMD CPUs and 8 x RTX2080Ti on each server.
  • BeeGFS 7.1.5
  • CentOS 7.9.2009
  • All the compute nodes reinstalled as usual
  • power cabling reorganisation to provide enough power for the new compute nodes
  • various bug fixes and scripting work

During this maintenance, we have introduced a simplified naming scheme for Slurm partitions.

This is important as you will need to update your sbatch scripts before February 2021 (next Baobab maintenance).

For more information, please visit: Simplified partition naming scheme
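
To help find which of your sbatch scripts need updating, here is a minimal sketch that lists the partitions a script requests. It assumes the partition is set with a standard `#SBATCH --partition` or `#SBATCH -p` directive on its own line; the function name is hypothetical and the partition names in the example are placeholders, not real Baobab partition names.

```python
import re

# Matches "#SBATCH --partition=name", "#SBATCH --partition name" and
# "#SBATCH -p name". Directives that combine several options on one line
# are not handled by this simple pattern.
PARTITION_RE = re.compile(
    r"^#SBATCH[ \t]+(?:--partition(?:=|[ \t]+)|-p[ \t]*)(\S+)",
    re.MULTILINE,
)


def partitions_in_script(script_text: str) -> list[str]:
    """Return the partition names requested in an sbatch script.
    A single directive may list several partitions separated by commas."""
    names = []
    for match in PARTITION_RE.finditer(script_text):
        names.extend(match.group(1).split(","))
    return names


# Hypothetical script; the partition names below are placeholders.
example = """#!/bin/sh
#SBATCH --partition=placeholder-a,placeholder-b
#SBATCH --time=01:00:00
srun hostname
"""
print(partitions_in_script(example))  # ['placeholder-a', 'placeholder-b']
```

Any name reported by such a check can then be compared against the new naming scheme described in the linked topic.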

For this maintenance we were helped by Rémy (astro dept.); thanks for the help!

Best regards

HPC team
Yann, Luca, Massimo