Hello,
Recent versions of TensorFlow (>=2.5.0) on PyPI are built against cuDNN 8.1 and CUDA 11.2 (see "Build from source | TensorFlow").
I tried the latest software stack available on baobab (cuDNN/8.0.4.30-CUDA-11.1.1) with TensorFlow 2.5.0, but it does not seem to work properly on Ampere cards.
Would it be possible to update the cuDNN/CUDA stack on baobab?
Thank you!
Wow, thanks!
I’m testing TensorFlow 2.6.0 with the new stack and everything seems to be working on Ampere nodes…
Great! Out of curiosity, how do you install and launch TensorFlow? pip, container?
I use venv; here is my setup script:
module load GCCcore/10.3.0 Python/3.9.5 cuDNN/8.2.1.32-CUDA-11.3.1
python -m venv sbdenv
source sbdenv/bin/activate
pip install --upgrade pip
pip install jupyterlab scipy tensorflow matplotlib numpy scikit-learn ipympl keras-tuner tensorflow-addons umap-learn tensorboard-plugin-profile tqdm
I have used conda in the past, but I find this approach easier to manage…
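As a side note, once a venv like the one above is set up, one way to see which CUDA and cuDNN versions the installed TensorFlow wheel was built against is to query `tf.sysconfig.get_build_info()`. The sketch below is only an illustration: the helper names `summarize_build` and `report_tensorflow_build` are made up here, and the exact format of the reported values (for example a bare major version for cuDNN) may differ between TensorFlow releases.

```python
def summarize_build(build_info):
    """Format the CUDA/cuDNN entries of a TensorFlow build-info mapping.

    Hypothetical helper: missing keys (e.g. on CPU-only builds) are
    reported as "unknown" rather than raising.
    """
    cuda = build_info.get("cuda_version", "unknown")
    cudnn = build_info.get("cudnn_version", "unknown")
    return f"built against CUDA {cuda}, cuDNN {cudnn}"


def report_tensorflow_build():
    """Print build info and visible GPUs for the installed TensorFlow.

    TensorFlow is imported lazily so that this file can be loaded even
    where TensorFlow is not installed.
    """
    import tensorflow as tf

    print("TensorFlow", tf.__version__)
    print(summarize_build(tf.sysconfig.get_build_info()))
    print("GPUs:", tf.config.list_physical_devices("GPU"))
```

Calling `report_tensorflow_build()` inside the activated venv on a GPU node, after the `module load` line, should show whether the loaded cuDNN module and the wheel's build versions line up.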
Hey Yann,
TensorFlow now requires cuDNN 8.6.0:
2023-04-11 10:39:53.416414: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:362 : INTERNAL: RET_CHECK failure (tensorflow/compiler/xla/service/gpu/gpu_compiler.cc:618) dnn != nullptr
2023-04-11 10:39:53.423614: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:417] Loaded runtime CuDNN library: 8.2.1 but source was compiled with: 8.6.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2023-04-11 10:39:53.442479: E tensorflow/compiler/xla/status_macros.cc:57] INTERNAL: RET_CHECK failure (tensorflow/compiler/xla/service/gpu/gpu_compiler.cc:618) dnn != nullptr
Could you please update the stack once again?
For compatibility, here is the list of modules I am currently loading:
module load GCCcore/10.2.0 Tkinter/3.8.6 Python/3.8.6 cuDNN/8.2.1.32-CUDA-11.3.1 git-lfs/3.1.2 FFmpeg/4.3.1 nodejs/12.19.0
Thank you very much!
Cheers
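For reference, the rule stated in the error message above (matching major version, equal or higher minor version at runtime) can be sketched as a small Python check. This is only an illustration of the rule as worded in the log, not TensorFlow's actual implementation; the function names are hypothetical.

```python
def parse_major_minor(version):
    """Return (major, minor) from a dotted version string such as "8.2.1"."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def cudnn_runtime_is_compatible(runtime, compiled):
    """Apply the rule quoted in the log message.

    The runtime cuDNN must have the same major version as the one the
    binary was compiled with, and an equal or higher minor version.
    """
    rt_major, rt_minor = parse_major_minor(runtime)
    cc_major, cc_minor = parse_major_minor(compiled)
    return rt_major == cc_major and rt_minor >= cc_minor


# The situation from the log: runtime 8.2.1, compiled against 8.6.0.
print(cudnn_runtime_is_compatible("8.2.1", "8.6.0"))  # False
# After loading a cuDNN 8.6 module the check passes.
print(cudnn_runtime_is_compatible("8.6.0", "8.6.0"))  # True
```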
Hello,
cuDNN 8.6 is now available on both clusters.
Best regards,
Great, thank you very much!