All my ongoing jobs suddenly crashed and I cannot access Bamboo via FileZilla anymore. Instead, I’m sent to a different folder which contains the following subfolders
Seconding this: VS Code Remote - SSH login reports the following error:
Could not chdir to home directory <$HOME> Remote I/O error
-bash <$HOME>/.bash_profile: Remote I/O error
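For anyone hitting this: when the home filesystem hangs like that, SSH itself is usually fine and it’s the storage that is stuck. A quick way to tell the two apart is to check whether a directory listing returns at all within a few seconds. A minimal sketch (the `/tmp` path and 5-second timeout are just illustrative choices, nothing cluster-specific):

```python
# Probe whether a directory responds within a timeout. A hung network
# filesystem (e.g. a BeeGFS home) will block os.listdir indefinitely,
# so we run it in a worker thread and give up after `timeout` seconds.
import os
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def fs_responds(path: str, timeout: float = 5.0) -> bool:
    """Return True if listing `path` completes within `timeout` seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(os.listdir, path)
    try:
        future.result(timeout=timeout)
        return True
    except FutureTimeout:
        return False          # filesystem is hanging
    except OSError:           # e.g. errno 121, "Remote I/O error"
        return False
    finally:
        pool.shutdown(wait=False)  # don't block on a hung worker thread

print(fs_responds("/tmp"))  # prints True for a responsive local path
```

Run it against your home or scratch path from a login node; if it returns False while `/tmp` returns True, the problem is the shared storage, not your account or SSH keys.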
EDIT: Login possible again!
Weirdly, I seem to have recovered my access via both PuTTY and FileZilla, and I can launch jobs again, but I have no way to know what happened or why my scripts were abruptly stopped…
Side note: it’s not very fair to be billed for cluster hours when our jobs are stopped by cluster issues. This has happened to me more than once, and it would be nice to have a way to get these hours cancelled when it happens in the future…
Hello,
We didn’t see any failure on Bamboo; I’m sorry to read your message. I checked the logs and I see about 10 minutes of downtime on the storage servers.
We will investigate this problem; sorry for the inconvenience.
Best regards,
Thanks, yes, indeed it was super short, but all jobs crashed abruptly. Access came back quickly, but in the meantime I still had to restart all the jobs that had crashed.
I think it just happened again… This time I didn’t have any batch running, but it has been totally impossible to connect to Bamboo since this morning.
Yes, I have the same problem. It started yesterday night: all of my jobs stopped suddenly, and when I try some commands I get “Remote I/O error”. Now I can’t even connect to my session; it says “ssh: connect to host login1.bamboo.hpc.unige.ch port 22: Connection timed out”.
Just echoing the problem. I am also unable to ssh into bamboo; I get a port 22 time out error.
Got the same problem now.
Same here, I cannot connect to Bamboo anymore (although I can still access the other clusters): ssh: connect to host login1.bamboo.hpc.unige.ch port 22: Operation timed out
The issue still seems to be ongoing, so I guess more news next Monday when people are back from vacation.
it’s working again on my side!
Actually no, only the access is back (and the ability to launch jobs), but then the jobs crash immediately with the following error message:
HDF5-DIAG: Error detected in HDF5 (1.8.12) thread 0:
  #000: H5Dio.c line 179 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: H5Dio.c line 547 in H5D__read(): can't read data
    major: Dataset
    minor: Read failed
  #002: H5Dchunk.c line 1836 in H5D__chunk_read(): unable to read raw data chunk
    major: Low-level I/O
    minor: Read failed
  #003: H5Dchunk.c line 2862 in H5D__chunk_lock(): unable to read raw data chunk
    major: Low-level I/O
    minor: Read failed
  #004: H5Fio.c line 113 in H5F_block_read(): read through metadata accumulator failed
    major: Low-level I/O
    minor: Read failed
  #005: H5Faccum.c line 258 in H5F_accum_read(): driver read request failed
    major: Low-level I/O
    minor: Read failed
  #006: H5FDint.c line 142 in H5FD_read(): driver read request failed
    major: Virtual File Layer
    minor: Read failed
  #007: H5FDsec2.c line 725 in H5FD_sec2_read(): file read failed: time = Mon Jan 5 09:30:30 2026, filename = '/srv/beegfs/scratch/users/c/clairis/fMRI_analysis/results/CAPS/CAPS/CAPS_SeedFree1c_K2__GM50.mat', file descriptor = 787, errno = 121, error message = 'Remote I/O error', buf = 0x14d4e6778fa0, total read size = 500, bytes this sub-read = 500, bytes actually read = 18446744073709551615, offset = 523929
    major: Low-level I/O
    minor: Read failed
So I’m guessing it’s not fully fixed yet. The issue seems to be the one reported here: [2026] Current issues on HPC Cluster - #3 by Yann.Sagon, and it is listed as still ongoing, so I’m going to wait for the announcement of the fix before trying again.
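In the meantime, one way to make long jobs a bit more resilient to these short storage blips is to retry transient I/O errors instead of dying on the first one. The errno 121 in the trace above is EREMOTEIO (“Remote I/O error”). A rough sketch of a generic retry wrapper — nothing cluster-specific, and whatever reader function you wrap is your own:

```python
# Retry a flaky read a few times with backoff: a transient "Remote I/O
# error" (errno 121) during a short storage blip can sometimes be ridden
# out instead of crashing the whole job.
import errno
import time

def retry_io(fn, attempts=5, delay=2.0):
    """Call fn(); on transient remote-I/O errors, wait and retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except OSError as exc:
            # Only retry the transient remote-I/O flavours; anything else
            # (missing file, permissions, ...) is a real error.
            if exc.errno not in (errno.EREMOTEIO, errno.EIO):
                raise
            if attempt == attempts - 1:
                raise
            time.sleep(delay * (attempt + 1))  # simple linear backoff
```

Usage would be something like `data = retry_io(lambda: load_mat(path))`, where `load_mat` stands in for whatever reader your pipeline uses (a hypothetical name here). This only helps for blips of a few seconds, not for a multi-hour outage like this one.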
Dear all, we returned from vacation today; this is our first issue of the year!
It is now solved. [2026] Current issues on HPC Cluster

