I am unable to create any new file on Yggdrasil scratch. My colleagues are experiencing the same issue.
Steps to Reproduce
Create a dummy file on scratch, for example with touch test_file
Produces the following error:
touch: setting times of ‘test_file’: Remote I/O error
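For anyone who wants to check both writing and reading in one go, here is a minimal Python sketch of the same test. It is only an illustration: the function name `probe_directory` is made up for this example, and you have to pass your own scratch directory, since the exact path is not given in this thread. On Linux, errno 121 is `EREMOTEIO`, which is the "Remote I/O error" shown above.

```python
import errno
import os
import tempfile


def probe_directory(path):
    """Create, write, read back, and delete a small file in `path`.

    Returns (True, None) on success, or (False, message) if an OSError
    such as "Remote I/O error" is raised along the way.
    """
    try:
        # mkstemp creates a uniquely named file, so existing data is never touched.
        fd, name = tempfile.mkstemp(prefix="io_probe_", dir=path)
        try:
            with os.fdopen(fd, "w") as handle:
                handle.write("probe")
            with open(name) as handle:
                if handle.read() != "probe":
                    return False, "read back unexpected content"
        finally:
            # Always try to clean up the probe file.
            os.remove(name)
    except OSError as exc:
        # Translate the numeric errno into its symbolic name when known.
        code = errno.errorcode.get(exc.errno, "UNKNOWN")
        return False, f"{code} (errno {exc.errno}): {exc.strerror}"
    return True, None
```

Calling `probe_directory("/path/to/your/scratch/dir")` (placeholder path) returns `(True, None)` when the file system behaves normally. The returned message is handy to paste into a report, because it includes the errno name and number.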
I was just writing a report myself, but you were faster.
The problem also affects reading files and happens for other users, too.
The last times this happened, the file system was restarted. But judging from the list of issues, the same problem keeps recurring, and the interval between incidents seems to be shrinking, so it is happening more and more often.
Hence, we need a more robust long-term solution than restarting the file system every time. HPC team, please investigate this and find a permanent fix.
Thanks for reporting this issue. After checking, we found that the scratch2 server was in an error state due to a failed disk. The server is working again after a reboot. I will open a ticket with the provider to report this hardware problem.
Unfortunately, the I/O errors are back again.
A serious hardware problem has occurred on the Yggdrasil scratch.
For now, I have just restarted the file system.
You can work as usual again; I will continue to analyze the logs to trace the source of the problem.
I wanted to let you know that I keep getting “Communication error on send” errors on Yggdrasil, although this post says the issue was resolved.
Thank you for your help,
You’re right, we had another issue, this time on the scratch1 server. I have just updated the post.
Thank you for your answer.
I wanted to let you know that, although the issue seemed resolved over the past few days, I am again getting errors such as “OSError: [Errno 121] Remote I/O error”.
This time the cause wasn’t hardware but users storing too many files on the scratch space. See: New scratch policy: quota on number of files
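To get a rough idea of how many files you keep on scratch, something like the following Python sketch may help. It is an assumption-laden illustration: `count_entries` is a name made up here, the directory to scan is up to you, and how the cluster actually counts entries against the quota is not stated in this thread, so treat the result as an estimate only.

```python
import os


def count_entries(root):
    """Count files and directories below `root`.

    Symlinks to directories are counted as directories but not descended
    into, because os.walk does not follow links by default.
    """
    files = 0
    dirs = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += len(filenames)
    return files, dirs
```

Walking a very large tree generates a lot of metadata requests, so it is probably best to run this on a single project directory rather than on your whole scratch space.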