Primary information
Username: coppinp
Cluster: Yggdrasil
Description
Jobs crash because certain nodes cannot read from /cvmfs
The two nodes I identified are cpu097 and cpu107
(though I did not check all nodes individually)
I know you are all busy with the Baobab upgrade. This is not urgent, as the problem nodes can easily be excluded. I am just posting it here while I remember.
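For anyone hitting the same problem, the exclusion mentioned above can be sketched as follows. This assumes the standard Slurm `--exclude` option; `job.sh` is a placeholder name, not my actual script.

```bash
#!/bin/bash
# Skip the two nodes where /cvmfs was found to be unreadable.
#SBATCH --exclude=cpu097,cpu107

# Equivalent on the command line (job.sh is a hypothetical script name):
#   sbatch --exclude=cpu097,cpu107 job.sh
```

The node list would need extending if other nodes turn out to be affected.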
Steps to Reproduce
Run a job on cpu097 or cpu107 in which the bash script contains the line:
source /cvmfs/dampe.cern.ch/centos7/etc/setup_conda_python2.7_tensorflow2.1.sh
Expected Result
The setup script is sourced and the environment variables are set
Actual Result
Wed Nov 22 16:34:35 CET 2023 - This is cpu107.yggdrasil, executing task
/var/spool/slurmd/job29506985/slurm_script: line 72: /cvmfs/dampe.cern.ch/centos7/etc/setup_conda_python2.7_tensorflow2.1.sh: No such file or directory
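Since I did not check every node, a guard like the one below could be added to the job script so that an affected node reports itself in the log before the job fails. This is only a sketch: `check_and_source` is a helper name I made up, and it only tests that the file is readable, not why /cvmfs is unavailable.

```bash
# Hypothetical helper (not part of the original job script):
# source a setup script only if this node can actually read it.
check_and_source() {
    if [ -r "$1" ]; then
        . "$1"
    else
        # Report the node name so that broken nodes can be identified from the job logs.
        echo "$(hostname): cannot read $1" >&2
        return 1
    fi
}

# Intended use inside the job script:
# check_and_source /cvmfs/dampe.cern.ch/centos7/etc/setup_conda_python2.7_tensorflow2.1.sh || exit 1
```

With this in place, the job stops early with a single line naming the node, instead of failing later on missing environment variables.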