Hello,
Very recently I started getting the following error, and FileZilla has trouble retrieving the directory listing while the job is running:
srun: error: eio_handle_mainloop: Abandoning IO 60 secs after job shutdown initiated
My code runs a for loop. After some time, the error occurs during one of the iterations and the job does not complete.
From what I have found online, this error occurs when:
“Slurm is giving up waiting for stdout/stderr to finish. This typically happens when some rank ends early while others are still wanting to write. If you don’t get complete stdout/stderr from the job, please resubmit the job.”
but I do not understand what this means or how to fix it.
This is the batch script I submit on Yggdrasil:
#!/bin/sh
#SBATCH --cpus-per-task=1
#SBATCH --job-name=smoothed
#SBATCH --ntasks=1
#SBATCH --time=01:30:00
#SBATCH --array=1-9999
#SBATCH --partition=shared-cpu
#SBATCH --mail-type=ALL
#SBATCH --mail-user=younes.boulaguiem@unige.ch
## deps
module load foss/2019b R/3.6.2
## main
srun Rscript smoothed_HPC.R $SLURM_ARRAY_TASK_ID
I would greatly appreciate your help!
Thanks in advance,
Younes