Baobab scheduled maintenance on the 2nd and 3rd of May

Dear users,

We will perform maintenance on the cluster on the 2nd and 3rd of May.

The cluster will be completely unavailable during this period, including access to your files.

What we plan to do during this maintenance:

  1. Decommission CUI’s Scilla cluster and migrate 8 of its compute nodes to Baobab.

  2. Continue the installation of two scratch storage servers with a capacity of 1 PB.

  3. Make some cabling changes (Ethernet, InfiniBand, power).

  4. Install a GPU compute node.

  5. Upgrade software.

  6. Continue migrating compute nodes from CentOS 6 to CentOS 7.

:warning: IMPORTANT :warning:

You are probably aware that we are migrating Baobab from CentOS 6 to CentOS 7.

As part of this migration we will move all the nodes of the “parallel”, “mono”, “debug” and “bigmem” partitions to CentOS 7. The affected partitions will be renamed as follows:

parallel  => parallel-EL7
mono      => mono-EL7
debug     => debug-EL7
bigmem    => bigmem-EL7
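
If you keep many job scripts, the renames above can be applied mechanically. The following is only a minimal sketch: it assumes your scripts select a partition with a Slurm-style `#SBATCH --partition=` (or `-p`) directive, which this announcement does not state, and the helper names `update_partition_line` and `update_script` are hypothetical.

```python
import re

# Old name -> new name, taken from the rename table above.
RENAMED = {
    "parallel": "parallel-EL7",
    "mono": "mono-EL7",
    "debug": "debug-EL7",
    "bigmem": "bigmem-EL7",
}

# Assumption: Slurm-style directives such as
#   #SBATCH --partition=mono      or      #SBATCH -p mono,debug
_DIRECTIVE = re.compile(r"^(#SBATCH\s+(?:--partition=|-p\s+))(\S+)(.*)$")


def update_partition_line(line: str) -> str:
    """Rename partitions on a single directive line; other lines pass through."""
    match = _DIRECTIVE.match(line)
    if not match:
        return line
    prefix, names, rest = match.groups()
    # A directive may list several partitions separated by commas.
    renamed = ",".join(RENAMED.get(name, name) for name in names.split(","))
    return f"{prefix}{renamed}{rest}"


def update_script(text: str) -> str:
    """Apply the rename to every line of a job script given as one string."""
    return "\n".join(update_partition_line(line) for line in text.split("\n"))


if __name__ == "__main__":
    example = "#!/bin/sh\n#SBATCH --partition=mono\nsrun ./my_program\n"
    print(update_script(example))
```

Names that are already suffixed with “-EL7” are left untouched, so running the sketch twice on the same script is harmless.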

To submit jobs to those partitions and to the ones already migrated, it is best to connect to the second login node, which already runs CentOS 7; otherwise you may run into side effects. Its address is baobab2.hpc.unige.ch

Unless users report issues related to this migration, we will migrate all the remaining nodes to CentOS 7 starting on the 16th of May.

Your HPC team

Dear Baobab users,

The maintenance is now over.

Here is what was done:

  • The biggest part of the maintenance was moving compute nodes, switches, etc. to optimize their placement in Baobab’s racks.
  • We have added 8 compute nodes from the CUI cluster Scylla, which we decommissioned today.
  • We have reinstalled every node to a fresh state with the latest CentOS 6 or CentOS 7.
  • We have migrated every node of the parallel and mono partitions to CentOS 7, as announced.

The cluster is now composed of nodes installed with CentOS 6 and nodes installed with CentOS 7.
Partitions whose nodes run CentOS 7 are suffixed with “-EL7”.

IMPORTANT:
You must choose the correct login node:

  • To submit a job to a CentOS 6 node, you must submit your job from baobab.unige.ch (login1).
  • To submit a job to a CentOS 7 node, you must submit your job from baobab2.hpc.unige.ch (login2).
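
The rule above can be written down as a small lookup, which may help if you drive submissions from your own tooling. This is only a sketch: the function name `login_node_for` is hypothetical, and it relies solely on the “-EL7” suffix convention and the two addresses given in this post.

```python
LOGIN_CENTOS6 = "baobab.unige.ch"       # login1, for CentOS 6 nodes
LOGIN_CENTOS7 = "baobab2.hpc.unige.ch"  # login2, for CentOS 7 nodes


def login_node_for(partition: str) -> str:
    """Return the login node to submit from for the given partition name.

    Partitions whose nodes run CentOS 7 carry the "-EL7" suffix;
    every other partition is treated as CentOS 6.
    """
    return LOGIN_CENTOS7 if partition.endswith("-EL7") else LOGIN_CENTOS6


if __name__ == "__main__":
    for name in ("mono-EL7", "debug"):
        print(f"{name}: submit from {login_node_for(name)}")
```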

If you had pending jobs in the mono, parallel, bigmem or shared-bigmem partitions, you have to cancel them, as those partitions no longer exist.
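
As an illustration of the cleanup, the sketch below only builds and prints the cancel commands so that you can review them before running anything. It assumes the scheduler is Slurm and that its `scancel` command is available on the login node, neither of which is stated in this post; the function name `scancel_commands` is hypothetical.

```python
import getpass
import shlex

# Partitions listed above as no longer existing.
REMOVED_PARTITIONS = ("mono", "parallel", "bigmem", "shared-bigmem")


def scancel_commands(user, partitions=REMOVED_PARTITIONS):
    """Build one scancel command per removed partition (nothing is executed).

    Assumption: Slurm's scancel, restricted with --user, --state and --partition.
    """
    return [
        ["scancel", f"--user={user}", "--state=PENDING", f"--partition={name}"]
        for name in partitions
    ]


if __name__ == "__main__":
    # Print the commands for the current user; run them by hand once reviewed.
    for command in scancel_commands(getpass.getuser()):
        print(shlex.join(command))
```

Restricting the command to pending jobs and to your own user keeps it from touching anything that is still running elsewhere.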

Roadmap:

  • The restriction that each login node can only submit to nodes running its own CentOS version will be lifted soon.
  • We will finish the installation of a big scratch space (1 PB) to be used for short-term storage.

Not everything is in the cloud!

Luca at work


Baobab rear side during the maintenance