/cvmfs not found on certain nodes

Primary information

Username: coppinp
Cluster: Yggdrasil

Description

Dear HPC team, for the past few days certain jobs have been failing immediately when they try to source their software environment, because they cannot read the /cvmfs directory. Specifically, I run the command

source /cvmfs/dampe.cern.ch/centos7/etc/setup_conda_python2.7_tensorflow2.1.sh

On most nodes, this works without errors, as expected, but certain nodes produce the error:

/var/spool/slurmd/job23269249/slurm_script: line 14: /cvmfs/dampe.cern.ch/centos7/etc/setup_conda_python2.7_tensorflow2.1.sh: No such file or directory

Below is a non-exhaustive list of nodes on which the error occurs:
cpu017,cpu018,cpu020,cpu021,cpu023,cpu024,cpu022,cpu025,cpu026,cpu036,cpu081,cpu082
A few examples of nodes that work fine:
cpu106,cpu107,cpu073
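
In the meantime, a guard like the one below in the job script could make an affected job stop with a message that names the node, instead of failing on the source line. This is only a sketch: `require_readable` and `SETUP` are hypothetical names introduced here, not part of the DAMPE setup, and it assumes the job script runs under bash.

```shell
#!/bin/bash
# Hypothetical helper: return 0 if the given file is readable on this node,
# otherwise print the node name to stderr and return 1.
require_readable() {
    local path="$1"
    if [ ! -r "$path" ]; then
        echo "ERROR: $path is not readable on node $(hostname)" >&2
        return 1
    fi
    return 0
}

# Path of the setup script sourced by the job (taken from the report above).
SETUP=/cvmfs/dampe.cern.ch/centos7/etc/setup_conda_python2.7_tensorflow2.1.sh
```

In the job script this would be used as `require_readable "$SETUP" || exit 1` followed by `source "$SETUP"`, so the node name ends up in the job's error output and can be added to the list above.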

Hi,

FIXED: the CVMFS filesystem has been mounted on the affected nodes, so you should now have access to it:

(yggdrasil)-[root@login1~]$ clush -bw @compute ls  /cvmfs/dampe.cern.ch/centos7/etc/setup_conda_python2.7_tensorflow2.1.sh
---------------
cpu[001-018,020-082,085-111,113-120,122-150],gpu[001-002,005-008] (151)
---------------
/cvmfs/dampe.cern.ch/centos7/etc/setup_conda_python2.7_tensorflow2.1.sh