Hi there,
NB: this seems to be exactly the same problem as "Issue with GPU on CentOS7".
The upstream CUDA deviceQuery sample does not report any error; the test case is available at https://gitlab.unige.ch/hpc/softs/tree/ff8b7626113206871ad380ad496b327bc8fa7aa8/c/cuda (launched on gpu010/Slurm-18517239, gpu009/Slurm-18517272 and gpu008/Slurm-18517273).
Now back to PyTorch:
- a simple test works with module:PyTorch/0.3.0-Python-3.6.4; the test case is available at https://gitlab.unige.ch/hpc/softs/tree/3de4a730f5d8c617e2586fda7058bb7ae0eeb66b/p/pytorch (launched on gpu010/Slurm-18574803, gpu009/Slurm-18574804 and gpu008/Slurm-18574805).
- your PyTorch in Singularity test works as well; the test case is available at https://gitlab.unige.ch/hpc/softs/commit/b9973e982654776742faefd79f016777e9ad56e6 (launched on gpu010/Slurm-18693217, gpu009/Slurm-18693287 and gpu008/Slurm-18693288, after having built the image as you suggested).
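
For a quick sanity check on a GPU node before rerunning, something along these lines could help. It is not the test case linked above, only a minimal sketch: the function name `cuda_report` is hypothetical, and it assumes nothing beyond the standard `torch.cuda` calls (`is_available`, `device_count`, `get_device_name`). If PyTorch cannot be imported at all, it returns an empty report instead of failing.

```python
import importlib


def cuda_report():
    """Collect basic facts about the PyTorch/CUDA setup visible to this job.

    Hypothetical helper for illustration; not part of the linked test cases.
    """
    report = {"torch_version": None, "cuda_available": False, "devices": []}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not importable here (e.g. the module was not loaded).
        return report

    report["torch_version"] = str(torch.__version__)
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        # List the name of every GPU that this process can see.
        for index in range(torch.cuda.device_count()):
            report["devices"].append(torch.cuda.get_device_name(index))
    return report


if __name__ == "__main__":
    print(cuda_report())
```

Running it inside the Slurm job, both with the module and inside the Singularity image, should show whether the two environments see the same devices.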
@Pablo.Strasser, can you please test again with a clean build?
Thx, bye,