Baobab: Login node down
Dear users,
The login node on Baobab has crashed. The server has been rebooted and is available again.
We apologize for any inconvenience caused.
Thank you for your understanding.
Status : Solved
start: 2024-07-21T18:42:00Z
end:Invalid date
Bamboo Scratch Storage Unavailable
Dear HPC Users,
The scratch storage on Bamboo is currently unavailable due to an ongoing issue. Our team has already contacted the provider and we are actively working with them to resolve the situation as quickly as possible.
Please note that the scratch storage has been unmounted on compute and login nodes and will remain unavailable until further notice. We will update you as soon as we have more information on the situation.
Thank you for your understanding,
Best Regards,
Status : Solved
start: 2024-09-10T22:33:00Z
end: 2024-09-26T07:33:00Z
Update: the vendor will perform an intervention on the 25th of September to fix the issue.
The service is back in production without data loss!
Yggdrasil nodes unavailable
Dear HPC Users,
Yggdrasil is currently experiencing issues with its electrical power supply, which has resulted in a reduced number of available nodes on the cluster.
Electricians are working to resolve the issue.
Thank you for your understanding.
Best Regards,
Status : Solved
start: 2024-09-13T21:30:00Z
end: 2024-09-17T12:24:00Z
Dear HPC Users,
Yggdrasil is currently experiencing issues with its electrical power supply, which has resulted in a reduced number of available nodes on the cluster.
This is the same issue as in mid-September. We will check with the datacenter manager to find out what is going on.
Thank you for your understanding.
Best Regards,
Status : Partially solved
start: 2024-09-27T22:02:00Z
end: 2024-09-30T09:45:00Z
edit: The electrical cabling on Yggdrasil was modified incorrectly by someone at Astro, without notifying us. The Astro IT team is reverting the change. This is only a partial workaround, as there still appears to be an overload issue that has to be solved.
Dear HPC Users,
We have set all nodes on every cluster to drain. Because of an issue with the scratch storage, we need to upgrade scripts on every node. Don't worry: as soon as a node is upgraded, we will resume it.
Thank you for your understanding.
Best Regards,
Status : In progress
start: 2024-09-29T22:02:00Z