It seems you have been misinformed. We spent several hours in October with someone from your group, and we were open-minded about the proposals.
- We talked about the fairshare issue, which is being resolved.
- We have found different solutions to meet their needs (such as reservations on your private partition).
- We have also contacted all reported or identified users who are not following best practices.
- We deployed new sysadmin tools to be more reactive. Storage performance issues and login node load issues have been greatly reduced.
- Last week, 22 GPU cards were installed on Baobab, increasing availability (see "New computer installed gpu[032-035].baobab").
- GPUs are now allocated based on their compute capacity, with low-end models allocated first (see "Baobab scheduled maintenance: 28-29 September 2022 - #4 by Yann.Sagon").
- Recently, we informed some users by email of a way to restrict their jobs to specific GPU nodes according to their needs. (More information here)
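
To illustrate the "low-end models first" allocation idea from the list above, here is a minimal Python sketch. It is not the scheduler's actual implementation: the node names, model labels, capacity scores, and the `allocate` helper are all hypothetical and only show the ordering principle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    node: str      # hypothetical node name
    model: str     # hypothetical model label
    capacity: int  # hypothetical relative compute score (higher = faster)


def allocate(free_gpus, count, min_capacity=0):
    """Pick `count` GPUs, weakest eligible models first."""
    # Keep only GPUs that satisfy the job's minimum requirement.
    eligible = [g for g in free_gpus if g.capacity >= min_capacity]
    if len(eligible) < count:
        raise ValueError("not enough free GPUs meeting the requirement")
    # sorted() is stable, so GPUs of equal capacity keep their input order.
    return sorted(eligible, key=lambda g: g.capacity)[:count]


free = [
    Gpu("node-a", "high-end", 30),
    Gpu("node-b", "low-end", 10),
    Gpu("node-c", "mid-range", 20),
]

picked = allocate(free, 2)
print([g.model for g in picked])  # prints ['low-end', 'mid-range']
```

The point of this ordering is that a job with no specific requirement leaves the high-end cards free for jobs that really need them, while a job that sets `min_capacity` still gets a suitable card.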
Next actions:
- Find a way to limit the number of GPU cards used at the same time per user.
- Installation of newly ordered GPUs on Baobab
- New cluster installation: Bamboo
This is your interpretation, not the reality. You are working on an academic cluster: some users do not have HPC knowledge and learn by trying, so they do not have any bad intentions. For the past two months, new HPC training workshops have been created and delivered by the SciCoS team to educate new users (Teaching & workshops - SciCoS - Scientific Computing Support - UNIGE). If I remember correctly, this was discussed with people from your group.
Given all these elements, I do not consider that we refused to take any action. We always emphasize fair use of the cluster, taking into account the private nodes and their priorities.
For any questions, suggestions, or issues, feel free to join the hpc-lunch meeting held on the first Thursday of every month (more information here).
Best regards,
Your lovely HPC team