I/O errors on Baobab home directory

Primary information

Username: serpolla
Cluster: Baobab

Description

There seems to be a problem with the home directory on Baobab.
Many files cannot be read because of I/O errors.

Same exact problem here. I already created an issue and called Yann, with no response. I can log in, but the files I’ve created aren’t readable.

A Remote I/O error likely means something is wrong with the filesystem. @Yann.Sagon, help!
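To give the admins a concrete list of affected files, something like the following might help. It is only a rough sketch, not an official Baobab tool: the function name `find_unreadable` is made up for illustration, and it assumes a `head` that supports `-c` (GNU coreutils does). It tries to read the first byte of every regular file under a directory and prints the paths where that fails. It cannot tell an I/O error apart from a plain permission problem, and file names containing newlines are not handled.

```shell
# Hypothetical helper: print every regular file under a directory
# whose first byte cannot be read.
find_unreadable() {
    dir="${1:-$HOME}"   # default to the home directory when no argument is given
    find "$dir" -type f 2>/dev/null | while IFS= read -r path; do
        # head exits non-zero when opening or reading the file fails,
        # for example with "Remote I/O error" or "Permission denied".
        if ! head -c 1 -- "$path" >/dev/null 2>&1; then
            printf '%s\n' "$path"
        fi
    done
}
```

Running `find_unreadable "$HOME"` on the login node would then print one path per unreadable file, which can be pasted into the issue.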

I have also seen this error on Baobab.

$ ssh cardenac@login1.baobab.hpc.unige.ch
(cardenac@login1.baobab.hpc.unige.ch) Password: 
Last login: Sun Dec 29 14:51:54 2024 from app6.baobab
 ____              _           _
|  _ \            | |         | |
| |_) | __ _  ___ | |__   __ _| |__
|  _ < / _` |/ _ \| '_ \ / _` | '_ \
| |_) | (_| | (_) | |_) | (_| | |_) |
|____/ \__,_|\___/|_.__/ \__,_|_.__/
             _             _      __ 
            | |           (_)    /_ |
            | | ___   __ _ _ _ __ | |
            | |/ _ \ / _` | | '_ \| |
            | | (_) | (_| | | | | | |
            |_|\___/ \__, |_|_| |_|_|
                      __/ |          
                      |___/  

 Documentation: https://doc.eresearch.unige.ch/hpc/start
 Forum: https://hpc-community.unige.ch/
 OpenOndemand: https://openondemand.baobab.hpc.unige.ch/
 support: https://doc.eresearch.unige.ch/hpc/start#support_-_get_help


-bash: /home/users/c/cardenac/.bash_profile: Remote I/O error
(baobab)-[cardenac@login1 ~]$ 

But Bamboo looks fine:

~$ ssh cardenac@login1.bamboo.hpc.unige.ch
(cardenac@login1.bamboo.hpc.unige.ch) Password: 
Last login: Sun Dec 29 13:50:50 2024 from 77-59-137-204.dclient.hispeed.ch
 ____                  _
|  _ \                | |
| |_) | __ _ _ __ ___ | |__   ___   ___
|  _ < / _` | '_ ` _ \| '_ \ / _ \ / _ \
| |_) | (_| | | | | | | |_) | (_) | (_) |
|____/ \__,_|_| |_| |_|_.__/ \___/ \___/
                 _             _      __ 
                | |           (_)    /_ |
                | | ___   __ _ _ _ __ | |
                | |/ _ \ / _` | | '_ \| |
                | | (_) | (_| | | | | | |
                |_|\___/ \__, |_|_| |_|_|
                          __/ |          
                         |___/  

 Documentation: https://doc.eresearch.unige.ch/hpc/start
 Forum: https://hpc-community.unige.ch/
 OpenOndemand: https://openondemand.baobab.hpc.unige.ch/
 support: https://doc.eresearch.unige.ch/hpc/start#support_-_get_help


(base) (bamboo)-[cardenac@login1 ~]$ 

I’m not sure if it’s connected, but I was trying to run a quick debug script on debug-cpus (job IDs 14190096 and 14190098), and Slurm only produces an empty slurm-*.out file.

seff shows State: FAILED (exit code 15) for both jobs.

Hi,

I’ll check and let you know.

The problem has been solved. I will not investigate further because of the vacation week.

We’ll be back on January 3, 2025. :tada:

Thank you @Adrien.Albert !!