Left-click a dot on the graph (1) and select the metric by which you want to drill down into the data. In this example, we’ll drill down by PI (2). You can of course select another field if you are interested, for example, in usage per user.
We are interested in the usage of one PI. To filter the data, click on the filter icon (1), select the PI you are interested in (2), and confirm (3). As the list may be long, you can use the search box for easier lookup.
By default, the aggregation unit is automatic and depends on the time frame you are looking at. You can change the aggregation unit manually (1) and (2), for example to see your past usage summed by month.
Don’t forget that your users are using CPUs and probably GPUs too. In that case, you need to repeat the procedure with the GPU metric to get the GPU usage.
I was wondering if we could have access to the number of users for one PI too (like when there are a lot of hours, is it because there were more people?)
Is it possible to have a cumulative plot in the interface, or do we just have to download the data and do this on our own?
Finally, I was wondering if you have an estimate of the “cost” of a CPU hour in terms of electrical consumption (to be able to compute the corresponding CO2 emissions!). I’m wondering if a rough order of magnitude could be estimated by taking the yearly consumption and computing “total electrical consumption / (CPU + GPU hours)”? (That assumes GPU and CPU hours are similar, which I guess they are not, but…)
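The ratio described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: every number below is a made-up placeholder and not a measurement from the cluster, the function names are hypothetical, and the emission factor must be replaced with a value appropriate for the actual electricity mix.

```python
def energy_per_hour_kwh(total_energy_kwh, cpu_hours, gpu_hours=0.0):
    """Average energy per used hour, treating CPU and GPU hours alike.

    This is the simplification mentioned in the post: GPU and CPU hours
    are lumped together even though they likely draw different power.
    """
    total_hours = cpu_hours + gpu_hours
    if total_hours <= 0:
        raise ValueError("total hours must be positive")
    return total_energy_kwh / total_hours


def co2_kg(hours_used, kwh_per_hour, kg_co2_per_kwh):
    """CO2 estimate for a given number of used hours."""
    return hours_used * kwh_per_hour * kg_co2_per_kwh


# Placeholder inputs (assumptions, NOT real cluster figures).
yearly_energy_kwh = 1_000_000.0
yearly_cpu_hours = 40_000_000.0
yearly_gpu_hours = 10_000_000.0
emission_factor = 0.1  # kg CO2 per kWh, hypothetical value

kwh_per_hour = energy_per_hour_kwh(
    yearly_energy_kwh, yearly_cpu_hours, yearly_gpu_hours
)
my_co2 = co2_kg(100_000, kwh_per_hour, emission_factor)

print(f"{kwh_per_hour:.4f} kWh per hour, {my_co2:.1f} kg CO2 for 100000 hours")
```

Keeping the two steps as separate functions makes it easy to swap in a different weighting for GPU hours later, if separate figures become available.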
Anyway, thanks again for the tutorial, very useful!
Yes, it is possible to get the details per user for a given PI. After viewing the usage per PI, left-click on the graph again (step 1) and select the user entry (step 2).
Thank you so much @Yann.Sagon
That helps a lot!
Looking forward to the power consumption estimates! In the meantime, I’m using 150 W (hopefully it’s a good ballpark estimate).
For the cumulative plot, no worries, I’ve done it myself with the table from the webpage.
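For anyone wanting to do the same, a minimal sketch of the cumulative sum is shown below. It assumes the table was saved as CSV with one row per month; the column names `month` and `cpu_hours` and the values are hypothetical placeholders, so adapt them to the headers of the real export.

```python
import csv
import io
from itertools import accumulate

# Hypothetical export: one row per month, usage in the second column.
# Replace io.StringIO(exported) with open("your_file.csv") for a real file.
exported = """month,cpu_hours
2024-01,1200
2024-02,800
2024-03,1500
"""

rows = list(csv.DictReader(io.StringIO(exported)))
months = [row["month"] for row in rows]
hours = [float(row["cpu_hours"]) for row in rows]

# Running total: each entry is the sum of all months up to and including it.
cumulative = list(accumulate(hours))

for month, total in zip(months, cumulative):
    print(f"{month}  {total:>10.1f}")
```

The `months` and `cumulative` lists can then be passed to any plotting tool.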
I’m experiencing difficulty accessing OpenXDMoD. The page isn’t loading for me in several browsers (Brave and Firefox on Arch Linux, and Brave on mobile) across different networks (UNIGE and 5G).
I wanted to have a look at OpenXDMoD to monitor core usage during my workloads, which could help me optimize resource requests. htop isn’t providing the level of detail I need.
Could you please let me know if others are experiencing similar issues, or provide a fix? Thank you for your time and assistance. And if OpenXDMoD is reserved for PIs, could you please mention that in the docs?
Over the past few days we were adding SSO (single sign-on, to use your UNIGE ISIS account) authentication to OpenXDMoD, and this unfortunately created some instability. The instance is now working again and, as an extra bonus, you can now log in to it with your account, for example if you want to create and save custom reports.
OpenXDMoD is only useful for looking at your past usage; it does not provide real-time analytics.
Thanks a lot for the useful answer. In the end I SSHed into the compute node and used htop there. I also added some logging to my Python script to get more information when I cannot check htop regularly.
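As an illustration of that kind of logging, here is a minimal sketch using only the Python standard library on Linux. It is not the script from this post; the function names are hypothetical. Note that `RUSAGE_SELF` covers only the current process, so separate worker processes (for example Dask workers) would need their own logging.

```python
import logging
import os
import resource

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("usage")


def usage_snapshot():
    """Return peak memory, CPU time and node load for this process."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return {
        # ru_maxrss is reported in KiB on Linux, so divide to get GiB.
        "peak_rss_gib": ru.ru_maxrss / 1024**2,
        # User plus system CPU time consumed so far, in seconds.
        "cpu_seconds": ru.ru_utime + ru.ru_stime,
        # 1-minute load average of the whole node, not just this job.
        "load_1min": os.getloadavg()[0],
    }


def log_usage(label):
    """Log a snapshot with a label and return it for later comparison."""
    snap = usage_snapshot()
    log.info(
        "%s peak_rss=%.2f GiB cpu=%.1f s load=%.2f",
        label, snap["peak_rss_gib"], snap["cpu_seconds"], snap["load_1min"],
    )
    return snap


start = log_usage("start")
data = [i * i for i in range(200_000)]  # stand-in for the real workload
end = log_usage("after work")
```

Calling `log_usage` between the main stages of a script gives a coarse timeline that can be read after the job has finished.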
By the way, seff did not give me any useful information at all. htop on the compute node:
Good afternoon Yann. For your information (and to avoid getting bashed for irresponsible resource usage).
That was at the beginning of my script, but usage ramps up until almost 1000 GB of RAM is used (I still store chunks on disk while the RAM is full).
As for CPU usage, it is very intermittent (maybe because of I/O limitations?), even after setting up Dask for hyperthreading. The HPC docs also recommend requesting all available CPUs in a node if the whole RAM is needed, which makes sense to me, especially in the bigmem partition.
No bashing intended! I was just asking in case it wasn’t intentional, but it seems it is needed, so it is perfect as it is.
If you use the whole RAM of a compute node, the justification for requesting all the CPUs is that nobody else will be able to use the remaining CPUs anyway. In this case, if your job can benefit from more CPUs, it is better to request them. If your job uses only, let’s say, two cores and adding more CPUs doesn’t speed things up, it may be better not to request more CPUs than needed, as next year we’ll bill by CPU usage.
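One way to check whether extra CPUs pay off is to time the same job with different core counts and compare the core-hours consumed. The sketch below assumes usage is counted as requested cores multiplied by elapsed time, which is an assumption about the billing and not something confirmed here; the timings are made-up placeholders and the function names are hypothetical.

```python
def core_hours(cores, elapsed_hours):
    """Core-hours consumed, assuming usage = requested cores x elapsed time."""
    return cores * elapsed_hours


def speedup(baseline_hours, elapsed_hours):
    """How many times faster a run is compared with the baseline run."""
    return baseline_hours / elapsed_hours


# Placeholder timings: requested cores -> elapsed hours (NOT real data).
timings = {2: 10.0, 8: 4.0, 32: 3.5}

# Use the run with the fewest cores as the baseline.
baseline = timings[min(timings)]

report = {
    cores: (speedup(baseline, hours), core_hours(cores, hours))
    for cores, hours in sorted(timings.items())
}

for cores, (gain, used) in report.items():
    print(f"{cores:>3} cores: speedup {gain:.2f}x, {used:.1f} core-hours")

# Core count with the lowest core-hour consumption.
cheapest = min(report, key=lambda cores: report[cores][1])
print(f"Lowest core-hour consumption: {cheapest} cores")
```

With these placeholder timings, going from 8 to 32 cores barely shortens the run while multiplying the core-hours, which is the situation where requesting fewer CPUs makes sense unless the memory requirement already blocks the whole node.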