Hello,
I’m trying to use torchvision for a project, but I encounter a problem while loading it in my python script.
I load the following modules:
module load GCC/6.4.0-2.28 OpenMPI/2.1.2 Python/3.8.2 PyTorch/0.3.0-Python-3.6.4 GCCcore/9.3.0 torchvision/0.2.1-Python-3.6.4
But my job always ends with an error when I try to import torchvision in my script:
Inactive Modules:
  1) hwloc/1.11.8     2) numactl/2.0.11
Due to MODULEPATH changes, the following have been reloaded:
  1) binutils/2.28
The following have been reloaded with a version change:
  1) CUDA/10.1.243 => CUDA/9.1.85     2) GCCcore/6.4.0 => GCCcore/9.3.0
Traceback (most recent call last):
  File "NN.py", line 3, in <module>
    import torchvision.models as models
  File "/opt/ebsofts/MPI/GCC/6.4.0-2.28/OpenMPI/2.1.2/torchvision/0.2.1-Python-3.6.4/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/__init__.py", line 2, in <module>
  File "/opt/ebsofts/MPI/GCC/6.4.0-2.28/OpenMPI/2.1.2/torchvision/0.2.1-Python-3.6.4/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/datasets/__init__.py", line 1, in <module>
  File "/opt/ebsofts/MPI/GCC/6.4.0-2.28/OpenMPI/2.1.2/torchvision/0.2.1-Python-3.6.4/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/datasets/lsun.py", line 2, in <module>
  File "/opt/ebsofts/MPI/GCC/6.4.0-2.28/OpenMPI/2.1.2/torchvision/0.2.1-Python-3.6.4/lib/python3.6/site-packages/Pillow-5.1.0-py3.6-linux-x86_64.egg/PIL/Image.py", line 60, in <module>
  File "/opt/ebsofts/MPI/GCC/6.4.0-2.28/OpenMPI/2.1.2/torchvision/0.2.1-Python-3.6.4/lib/python3.6/site-packages/Pillow-5.1.0-py3.6-linux-x86_64.egg/PIL/_imaging.py", line 7, in <module>
  File "/opt/ebsofts/MPI/GCC/6.4.0-2.28/OpenMPI/2.1.2/torchvision/0.2.1-Python-3.6.4/lib/python3.6/site-packages/Pillow-5.1.0-py3.6-linux-x86_64.egg/PIL/_imaging.py", line 6, in __bootstrap__
  File "/opt/ebsofts/MPI/GCC/6.4.0-2.28/OpenMPI/2.1.2/Python/3.6.4/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libtiff.so.3: cannot open shared object file: No such file or directory
srun: error: node001: task 0: Exited with exit code 1
I tried changing the version of Python 3 used, but it didn’t change the result.
Bests,
Guy-Raphaël Stauffer
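A side note on the module load line above: it requests Python/3.8.2 together with modules whose names end in -Python-3.6.4. Assuming the usual convention that such a suffix names the interpreter a module was built against (an assumption, not something confirmed in this thread), the mismatch can be spotted with a small sketch like this. The helper name python_version_conflicts is hypothetical, and the mismatch may well be unrelated to the libtiff error itself.

```python
import re


def python_version_conflicts(modules):
    """Return (module, suffix_version, loaded_version) for each mismatch."""
    # Find the explicitly requested interpreter, e.g. "Python/3.8.2" -> "3.8.2".
    loaded = None
    for name in modules:
        m = re.fullmatch(r"Python/(\d+(?:\.\d+)*)", name)
        if m:
            loaded = m.group(1)
    conflicts = []
    if loaded is None:
        return conflicts
    for name in modules:
        # Assumed convention: a "-Python-X.Y.Z" suffix names the interpreter
        # the module was built against.
        m = re.search(r"-Python-(\d+(?:\.\d+)*)$", name)
        if m and m.group(1) != loaded:
            conflicts.append((name, m.group(1), loaded))
    return conflicts


# The module list from the first post.
line = ("GCC/6.4.0-2.28 OpenMPI/2.1.2 Python/3.8.2 PyTorch/0.3.0-Python-3.6.4 "
        "GCCcore/9.3.0 torchvision/0.2.1-Python-3.6.4")
for module, built_for, loaded in python_version_conflicts(line.split()):
    print(f"{module}: built for Python {built_for}, but Python/{loaded} was requested")
```

The check is purely textual: it says nothing about what the module system actually ends up loading, only that the requested names disagree.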
Hi there,
Have you loaded one of the libtiff modules?
capello@login2:~$ module spider libtiff
----------------------------------------------------------------------------------
LibTIFF:
----------------------------------------------------------------------------------
Description:
tiff: Library and tools for reading and writing TIFF data files
Versions:
LibTIFF/4.0.4
LibTIFF/4.0.6
LibTIFF/4.0.7
LibTIFF/4.0.8
LibTIFF/4.0.9
LibTIFF/4.0.10
LibTIFF/4.1.0
[...]
capello@login2:~$
Given your toolchain (foss/2018a), you should load LibTIFF/4.0.9:
capello@login2:~$ module load foss/2018a
capello@login2:~$ module load LibTIFF/4.0.9
capello@login2:~$ module list
Currently Loaded Modules:
  1) GCCcore/6.4.0     6) OpenMPI/2.1.2                     11) zlib/1.2.8
  2) binutils/2.28     7) OpenBLAS/0.2.20                   12) XZ/5.2.2
  3) GCC/6.4.0-2.28    8) FFTW/3.3.7                        13) libxml2/2.9.8
  4) numactl/2.0.11    9) ScaLAPACK/2.0.2-OpenBLAS-0.2.20   14) LibTIFF/4.0.9
  5) hwloc/1.11.8     10) foss/2018a
capello@login2:~$
However, this module does not provide libtiff.so.3, but a more recent one:
capello@login2:~$ module show LibTIFF/4.0.9 2>&1 | \
grep LIBRARY
prepend_path("LD_LIBRARY_PATH","/opt/ebsofts/Compiler/GCCcore/6.4.0/LibTIFF/4.0.9/lib")
prepend_path("LIBRARY_PATH","/opt/ebsofts/Compiler/GCCcore/6.4.0/LibTIFF/4.0.9/lib")
capello@login2:~$ find /opt/ebsofts/Compiler/GCCcore/6.4.0/LibTIFF/4.0.9/lib -type f -name libtiff.so.3
capello@login2:~$ find /opt/ebsofts/Compiler/GCCcore/6.4.0/LibTIFF/4.0.9/lib -type f -name libtiff.so\*
/opt/ebsofts/Compiler/GCCcore/6.4.0/LibTIFF/4.0.9/lib/libtiff.so.5.3.0
capello@login2:~$
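The same check can be done from Python, which is handy inside a job script. A minimal sketch, assuming the directories of interest are on LD_LIBRARY_PATH (as the module show output above suggests). The name find_shared_libs is a hypothetical helper, not part of any module on the cluster.

```python
import glob
import os


def find_shared_libs(stem, search_path=None):
    """Return sorted paths matching '<stem>.so*' in each directory of search_path."""
    if search_path is None:
        # Default to the loader path that the loaded modules have set up.
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        # Unlike "find -type f", this also lists symlinks such as libtiff.so.
        hits.extend(glob.glob(os.path.join(directory, stem + ".so*")))
    return sorted(set(hits))


for path in find_shared_libs("libtiff"):
    print(path)
```

If the list contains no entry ending in .so.3, an extension compiled against libtiff.so.3 cannot be satisfied by the currently loaded modules.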
We need to recompile some modules; I will get back to you ASAP.
Thx, bye,
Luca
Thank you for the quick answer.
I tried to load one libtiff module, but, as you said, it didn’t provide libtiff.so.3, so the error message didn’t change.
bests,
Guy-Raphaël Stauffer
Hello,
Is there any news on the module recompilation?
bests,
Guy-Raphaël Stauffer
Hi there,
Sorry for the delay. The error you get can be “solved” simply by loading Pillow/5.0.0-Python-3.6.4 (which has a correct module dependency on LibTIFF/4.0.9) after torchvision/0.2.1-Python-3.6.4, but then there is another, deeper error:
capello@login2:~$ module purge
capello@login2:~$ module load GCC/6.4.0-2.28 OpenMPI/2.1.2
capello@login2:~$ module load Python/3.6.4
The following have been reloaded with a version change:
1) zlib/1.2.8 => zlib/1.2.11
capello@login2:~$ module load PyTorch/0.3.0-Python-3.6.4
capello@login2:~$ module load torchvision/0.2.1-Python-3.6.4
capello@login2:~$ module load Pillow/5.0.0-Python-3.6.4
capello@login2:~$ python
Python 3.6.4 (default, Apr 25 2018, 10:28:12)
[GCC 6.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchvision
[...]
ImportError: liblzma.so.0: cannot open shared object file: No such file or directory
>>>
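Since missing libraries surface one at a time (first libtiff.so.3, now liblzma.so.0), it can save a few round trips to ask the dynamic loader about several sonames at once. A rough sketch using ctypes from the standard library. The helper missing_libraries is made up for illustration, and the list of sonames is just the two seen in this thread.

```python
import ctypes


def missing_libraries(sonames):
    """Return the sonames that the dynamic loader cannot open."""
    missing = []
    for soname in sonames:
        try:
            # CDLL asks the system loader to open the library, so it honours
            # LD_LIBRARY_PATH as set by the currently loaded modules.
            ctypes.CDLL(soname)
        except OSError:
            missing.append(soname)
    return missing


print(missing_libraries(["libtiff.so.3", "liblzma.so.0"]))
```

Note that a library can also fail to open because one of its own dependencies is missing, so a name in the result means "could not be loaded", not necessarily "file absent".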
I am recompiling Python/3.6.4…
Thx, bye,
Luca
Hi there,
For whatever reason, I am having trouble recompiling Python/3.6.4, so in the meantime I tried installing the latest torchvision on Python/3.6.6 via pip (cf. https://baobab.unige.ch/enduser/src/enduser/applications.html#custom-python-lib):
capello@login2:~$ module load GCC/7.3.0-2.30 CUDA/9.2.88 OpenMPI/3.1.1
capello@login2:~$ module load Python/3.6.6
capello@login2:~$ virtualenv --no-site-packages ~/test-torchvision-Python-3.6.6
Using base prefix '/opt/ebsofts/Python/3.6.6-fosscuda-2018b'
New python executable in /home/users/c/capello/test-torchvision-Python-3.6.6/bin/python
Installing setuptools, pip, wheel...done.
capello@login2:~$ . ~/test-torchvision-Python-3.6.6/bin/activate
(test-torchvision-Python-3.6.6) capello@login2:~$ pip install torchvision==0.2.2.post3
Collecting torchvision==0.2.2.post3
[...]
Installing collected packages: pillow, numpy, six, future, torch, torchvision
Successfully installed future-0.18.2 numpy-1.19.2 pillow-7.2.0 six-1.15.0 torch-1.6.0 torchvision-0.2.2.post3
(test-torchvision-Python-3.6.6) capello@login2:~$ python
Python 3.6.6 (default, Apr 15 2020, 16:42:51)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchvision
>>> quit()
(test-torchvision-Python-3.6.6) capello@login2:~$ pip install torchvision==0.7.0
Collecting torchvision==0.7.0
[...]
Successfully installed torchvision-0.7.0
(test-torchvision-Python-3.6.6) capello@login2:~$ python
Python 3.6.6 (default, Apr 15 2020, 16:42:51)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchvision
>>> quit()
capello@login2:~$
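To repeat this import check non-interactively, for example as the first step of a batch job, something along these lines should work inside the activated virtualenv. It is only a sketch: report_imports is a hypothetical helper, and it relies on the packages exposing a __version__ attribute, which is common but not guaranteed.

```python
import importlib


def report_imports(names):
    """Map each module name to its __version__, 'unknown', or the import error."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError as exc:
            # Missing shared objects (libtiff.so.3, liblzma.so.0, ...) surface here.
            report[name] = f"FAILED: {exc}"
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


for name, status in report_imports(["torch", "torchvision", "PIL"]).items():
    print(f"{name}: {status}")
```

Any line starting with FAILED carries the same ImportError text as in the tracebacks earlier in the thread, so a broken environment shows up before the real workload starts.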
@Guy-Raphael.Stauffer, could this be a solution, or do you strictly need Python/3.6.4?
Thx, bye,
Luca
I don’t need a specific version of Python, so Python 3.6.6 could be a solution for me.
Hello,
I tried with Python 3.6.6, and everything works fine.
Thank you very much for your help.
best regards,
Guy-Raphaël