Hello HPC team,
I am currently on gpu002 on Baobab, trying to run a script. It seems to keep failing with the error OSError: [Errno 28] No space left on device.
I have tried to clean up my scratch as much as possible. Is this still something on my end? If so, how can I go about sorting it out?
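One way to tell whether the failure comes from a full filesystem rather than from the script itself is to check the error number and the free space on the path being written to. The following is only a minimal sketch: the helper names are hypothetical, and `shutil.disk_usage` reports the state of the whole filesystem, not any per-user quota.

```python
import errno
import shutil


def describe_free_space(path):
    """Summarize free space on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    free_gib = usage.free / 2**30
    pct_used = 100 * usage.used / usage.total
    return f"{path}: {free_gib:.1f} GiB free, {pct_used:.0f}% used"


def is_no_space_error(exc):
    """True if `exc` is the 'No space left on device' error (Errno 28)."""
    return isinstance(exc, OSError) and exc.errno == errno.ENOSPC


if __name__ == "__main__":
    # "." is a placeholder; pass the directory your script writes to.
    print(describe_free_space("."))
```

If the report shows the target filesystem at or near 100% used, the error is caused by the shared storage being full, not by the script.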
Hi all,
I am facing the same issue. The scratch partition is full (49G available when running df -h, which I guess is some spare space).
Best,
Brian
Hi,
The BeeGFS scratch directory is almost full, with only 49GB remaining.
(baobab)-[root@gpu002 ~]$ df -t beegfs -h
Filesystem      Size  Used  Avail  Use%  Mounted on
beegfs_dpnc     503T  3.6T   499T    1%  /srv/beegfs/dpnc
beegfs_home     138T  119T    20T   86%  /home
beegfs_scratch  1.5P  1.5P    49G  100%  /srv/beegfs/scratch
We kindly ask everyone reading this message to clean up the scratch directory as soon as possible. A formal communication will be sent tomorrow, but immediate action is necessary to free up space.
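To decide what to delete first, it can help to see which of your directories take the most space. This is a rough sketch, not an official tool: the function names are hypothetical, the path you pass in is up to you, and it sums apparent file sizes, which may differ from what the filesystem actually allocates. Walking a large tree also generates metadata load, so it is better run on a subdirectory than on the whole scratch.

```python
import os


def dir_size(path):
    """Sum the apparent sizes (bytes) of all files below `path`."""
    total = 0
    # os.walk does not follow symlinked directories by default.
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def largest_subdirs(top, limit=10):
    """Return up to `limit` (size_bytes, path) pairs, largest first."""
    sizes = []
    with os.scandir(top) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((dir_size(entry.path), entry.path))
    return sorted(sizes, reverse=True)[:limit]


if __name__ == "__main__":
    # "." is a placeholder; point this at your own scratch directory.
    for size, path in largest_subdirs("."):
        print(f"{size / 2**30:8.2f} GiB  {path}")
```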
Hi Adrien,
Thanks, I am trying to clean up as much as feasible. I have also noticed that my training jobs, which are I/O-heavy, are running considerably slower. Could this be related to the scratch being saturated?
Dear @Debajyoti.Sengupta,
Probably yes. How much space do you need on the storage? I'm asking because you have several other alternatives:
- Use Bamboo: its home storage is much faster than the other storage we provide, but it is limited to 1TB.
- Request access to our fast storage on Baobab: it is very limited in size and has no backup at all, so it is only for temporary data.
- Use local storage on the compute node: it is only for ephemeral data (job duration).
Check here for more details.
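For the local-storage option, a job can copy its input to the node once, work on the local copy, and let the copy disappear when the work is done. The sketch below shows the general pattern only. The helper name is hypothetical, and using `TMPDIR` to find the node-local directory is an assumption: check the documentation for the actual local path on each cluster.

```python
import os
import shutil
import tempfile


def run_with_local_copy(src_dir, work, local_root=None):
    """Copy `src_dir` to node-local storage, call `work(local_path)`,
    and remove the copy afterwards. Returns whatever `work` returns."""
    # Assumption: TMPDIR points at node-local storage; otherwise fall
    # back to the system default temporary directory.
    local_root = local_root or os.environ.get("TMPDIR") or tempfile.gettempdir()
    with tempfile.TemporaryDirectory(dir=local_root) as tmp:
        local_data = os.path.join(tmp, "data")
        shutil.copytree(src_dir, local_data)
        return work(local_data)  # the copy is deleted when the block exits


if __name__ == "__main__":
    # Placeholder usage: count the entries in the staged copy of ".".
    print(run_with_local_copy(".", lambda path: len(os.listdir(path))))
```

Results that must outlive the job still have to be copied back to permanent storage inside `work`, since everything under the temporary directory is removed on exit.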
Best