NumPy and TensorLayer

I would like to run some TensorFlow code that requires the TensorLayer library (https://tensorlayer.readthedocs.io/en/latest/) and NumPy. I thought I could create a virtualenv and install all my dependencies there, but virtualenv does not seem to work.

I load TensorFlow as follows:

module load GCC/8.2.0-2.31.1  OpenMPI/3.1.3
module load TensorFlow/2.0.0-Python-3.7.2

But I get an error.

Here is my Slurm output file:

Inactive Modules:
  1) Python/3.7.2      4) Tcl/8.6.9         7) libpciaccess/0.14    10) ncurses/6.1
  2) SQLite/3.27.2     5) XZ/5.2.4          8) libreadline/8.0      11) numactl/2.0.12
  3) Szip/2.1.1        6) hwloc/1.11.11     9) libxml2/2.9.8

Due to MODULEPATH changes, the following have been reloaded:
  1) GMP/6.1.2     2) bzip2/1.0.6     3) libffi/3.2.1     4) zlib/1.2.11

The following have been reloaded with a version change:
  1) GCCcore/8.2.0 => GCCcore/6.3.0     2) binutils/2.31.1 => binutils/2.27


Activating Modules:
  1) Python/3.6.1      3) Tcl/8.6.6     5) libreadline/7.0
  2) SQLite/3.17.0     4) XZ/5.2.3      6) ncurses/6.0

Traceback (most recent call last):
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow_core/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow_core/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow_core/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/opt/ebsofts/MPI/intel/2017.1.132-GCC-6.3.0-2.27/impi/2017.1.132/Python/3.6.1/lib/python3.6/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
  File "/opt/ebsofts/MPI/intel/2017.1.132-GCC-6.3.0-2.27/impi/2017.1.132/Python/3.6.1/lib/python3.6/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: /opt/ebsofts/Core/GCCcore/6.3.0/lib64/libstdc++.so.6: version `CXXABI_1.3.11' not found (required by /opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow_core/python/_pywrap_tensorflow_internal.so)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 9, in <module>
import tensorflow as tf
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow/__init__.py", line 98, in <module>
from tensorflow_core import *
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow_core/__init__.py", line 40, in <module>
from tensorflow.python.tools import module_util as _module_util
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow/__init__.py", line 50, in __getattr__
module = self._load()
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow/__init__.py", line 44, in _load
module = _importlib.import_module(self.__name__)
  File "/opt/ebsofts/MPI/intel/2017.1.132-GCC-6.3.0-2.27/impi/2017.1.132/Python/3.6.1/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow_core/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow_core/python/pywrap_tensorflow.py", line 74, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow_core/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow_core/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
  File "/opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow_core/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/opt/ebsofts/MPI/intel/2017.1.132-GCC-6.3.0-2.27/impi/2017.1.132/Python/3.6.1/lib/python3.6/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
  File "/opt/ebsofts/MPI/intel/2017.1.132-GCC-6.3.0-2.27/impi/2017.1.132/Python/3.6.1/lib/python3.6/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: /opt/ebsofts/Core/GCCcore/6.3.0/lib64/libstdc++.so.6: version `CXXABI_1.3.11' not found (required by /opt/ebsofts/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/tensorflow_core/python/_pywrap_tensorflow_internal.so)


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.
srun: error: gpu006: task 0: Exited with exit code 1

Hi,
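It seems there is a library incompatibility between TensorFlow and the local environment: the Slurm output shows GCCcore/8.2.0 being swapped for GCCcore/6.3.0 and Python/3.7.2 for Python/3.6.1, so the TensorFlow build for Python 3.7.2 ends up loading a libstdc++ that lacks CXXABI_1.3.11.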
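One way to check this, as a minimal sketch: the helper below reports whether a shared library contains a given version tag, roughly what `strings <lib> | grep CXXABI` would show. The name lib_provides_version is made up for illustration and is not a cluster tool; the sketch assumes a POSIX shell with tr and grep.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the library at $1 contains the version tag $2.
lib_provides_version() {
    lib="$1"    # path to the shared library, e.g. a libstdc++.so.6
    tag="$2"    # version tag to look for, e.g. CXXABI_1.3.11
    # Break the binary into runs of printable characters (similar to `strings`)
    # and look for a run that is exactly the tag.
    LC_ALL=C tr -c '[:print:]' '\n' < "$lib" | grep -qxF -- "$tag"
}
```

With the path from the traceback, this could be called as `lib_provides_version /opt/ebsofts/Core/GCCcore/6.3.0/lib64/libstdc++.so.6 CXXABI_1.3.11 || echo "tag missing"`. Adding `module list` to the job script just before `python train.py` also shows which GCCcore and Python modules are really active at that point.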
If this cannot be fixed, an alternative is to create a Singularity image with all the software you need. This solution takes more effort and time to set up, but it is more flexible: it lets you install nearly any software you want in a standalone package.

To do that, follow these steps:

  1. Create a Dockerfile using a base image that contains TensorFlow (https://ngc.nvidia.com/catalog/containers/nvidia:tensorflow), and install all the Python libraries you need with pip (no need for a virtual env).
  2. Push your Docker image to Docker Hub.
  3. Convert the Docker image to a Singularity image.
  4. Run your code inside the image.

A minimal Dockerfile looks like this (pick the image tag you need from the catalog page above; 19.11-tf2-py3 is one example):

FROM nvcr.io/nvidia/tensorflow:19.11-tf2-py3
RUN pip install numpy
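The four steps above can be sketched as shell commands. This is only an outline under assumptions: the image name and the .simg file name are placeholders, the helper names (run, publish_image, convert_and_run) are made up, and by default the commands are only printed rather than executed.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to really execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 1-2: on a machine with Docker, in the folder containing the Dockerfile.
publish_image() {
    image="$1"    # placeholder, e.g. mydockerhubuser/myimage:latest
    run docker build -t "$image" .
    run docker push "$image"
}

# Steps 3-4: on the cluster, after loading the Singularity module.
# Inside a Slurm job the last command would be prefixed with srun.
convert_and_run() {
    image="$1"    # same Docker Hub reference as above
    simg="$2"     # placeholder path of the Singularity image to create
    run singularity build "$simg" "docker://$image"
    run singularity run --nv "$simg" python train.py
}
```

For example, `publish_image mydockerhubuser/myimage:latest` on your own machine, then `convert_and_run mydockerhubuser/myimage:latest "$HOME/myimage.simg"` on baobab.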

Hi @Pablo.Strasser,

Many thanks for the reply. I’m currently pushing the Docker image to Docker Hub. I have a question: once I have converted the Docker image to a Singularity image, do I just have to write a launchjob.sh file on baobab that loads the Singularity module, runs image.simg and then runs my Python script? Something like:

#SBATCH --cpus-per-task=1
#SBATCH --job-name=testsrgan
#SBATCH --ntasks=1
#SBATCH --time=05:00
#SBATCH --output=slurm-%J.out
#SBATCH --gres=gpu:1
#SBATCH --constraint="V5|V6"
#SBATCH --partition=shared-gpu-EL7

module load GCC/6.3.0-2.27 Singularity/2.4.2
singularity run image.simg
srun python train.py

Many thanks for your assistance

No, it would be more like this:

srun singularity run --nv image.simg python train.py

By default the home folder is mounted inside the container; you can mount additional folders with the -B option. If you are using a GPU, you also need the --nv flag.
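Putting this together with the #SBATCH header from the question, a complete job script could be generated as in the sketch below. The helper name write_job_script is made up, the module line is copied from the question and may need adjusting to the Singularity module used to build the image, and the optional second argument is a placeholder for an extra folder to mount with -B.

```shell
#!/bin/sh
# Hypothetical helper: print a job script for the given Singularity image.
write_job_script() {
    image="$1"          # Singularity image file, e.g. image.simg
    bind_dir="${2:-}"   # optional host folder to mount with -B
    cat <<EOF
#!/bin/sh
#SBATCH --cpus-per-task=1
#SBATCH --job-name=testsrgan
#SBATCH --ntasks=1
#SBATCH --time=05:00
#SBATCH --output=slurm-%J.out
#SBATCH --gres=gpu:1
#SBATCH --constraint="V5|V6"
#SBATCH --partition=shared-gpu-EL7

module load GCC/6.3.0-2.27 Singularity/2.4.2
srun singularity run --nv ${bind_dir:+-B $bind_dir }$image python train.py
EOF
}
```

For example, `write_job_script image.simg > launchjob.sh` followed by `sbatch launchjob.sh`; passing a folder as second argument adds the corresponding -B option to the last line.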

I hope this helps.

OK, thanks. I lack knowledge of Docker and Singularity: I wasn’t able to figure out how to convert a Docker image that is in a Docker Hub repository into a Singularity image. Do you have an example for this step?

You use the following command for that:

singularity build pathtoimage docker://dockerurl

Here pathtoimage is the path where the image will be stored on baobab (your home folder or scratch directory is perfect for that) and dockerurl is the URL of your image.

Many thanks. I now get an error saying:

tar: usr/src/tensorrt: Cannot open: File exists
tar: Exiting with failure status due to previous errors

after running singularity build test.img docker://lperozzi/baobab:test. I’m not sure if it is related to the version of the TensorFlow image I use in the Dockerfile. My Dockerfile:

# Base image on which we build our package.
FROM nvcr.io/nvidia/tensorflow:19.11-tf2-py3
# Add the name of the maintainer as metadata.
MAINTAINER Lorenzo Perozzi lorenzo.perozzi@unige.ch
# Disable GPU access when building to improve portability.
# This avoids making the image depend on the GPU driver version used when building.
ENV NVIDIA_VISIBLE_DEVICES=void
# Update the package list with an Ubuntu command. The -y flag ensures that we automatically answer yes to the prompt.
RUN apt-get update -y
# Install pip.
RUN apt-get install -y python3-pip
# Install tensorlayer, numpy and easydict. The quotes keep the shell from treating ">" as a redirection.
RUN pip3 install --no-cache-dir easydict==1.9 'tensorlayer>=2.0.0' numpy==1.16.1
# Enable the GPU again because the user of the image will want to use it.
ENV NVIDIA_VISIBLE_DEVICES=all

I tried running singularity build lolcow.img docker://godlovedc/lolcow and it let me create a Singularity .img, so the previous error is probably related to some conflict with nvcr.io/nvidia/tensorflow:19.11-tf2-py3?

Using the latest version of Singularity works:

PATH=$PATH:/usr/sbin
module load GCCcore/8.2.0 Singularity/3.4.0-Go-1.12
singularity build myimage.simg docker://pablostrasser/baobab_tensorlayer:latest
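A side note on the Dockerfile, unrelated to the tar error: a version specifier such as tensorlayer>=2.0.0 in a RUN line has to be quoted, because the shell form of RUN hands the line to /bin/sh, where an unquoted > is an output redirection. The small demonstration below uses echo as a stand-in for pip3 install and works in a temporary directory; the variable names are only for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so the stray file is easy to spot.
demo_dir="$(mktemp -d)"
cd "$demo_dir" || exit 1

# Unquoted: ">=2.0.0" is read as "send stdout to a file named =2.0.0",
# so the command only receives the bare word "tensorlayer".
echo tensorlayer>=2.0.0 numpy==1.16.1
unquoted_file="$(ls)"    # name of the file created by the redirection

# Quoted: the whole specifier reaches the command as a single argument.
quoted_args="$(echo 'tensorlayer>=2.0.0' numpy==1.16.1)"

echo "file created by the unquoted form: $unquoted_file"
echo "arguments seen with the quoted form: $quoted_args"
```

With pip3 in place of echo, the unquoted form would install whatever tensorlayer version pip resolves without the constraint and write pip's output to a file called =2.0.0.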

Ok! many thanks @Pablo.Strasser