[tutorial] launch openpose with GPU support through Singularity

A user asked us to install a piece of software named openpose on Baobab. It has a lot of dependencies and we didn’t have enough time to install it. More and more software is also available as a Docker image. That’s great, because we support Docker-like images through Singularity.

Steps:

  • build the image for singularity
  • launch the container

According to this page, the Docker image of openpose is available as garyfeng/docker-openpose.

Let’s build the Singularity image for openpose. This downloads the Docker image layers and takes some time. It also produces some warnings because we are running the command as a normal user instead of as root.

[sagon@login2 openpose] module load GCCcore/8.2.0 Singularity/3.4.0-Go-1.12
[sagon@login2 openpose] PATH=$PATH:/sbin singularity build openpose.simg docker://garyfeng/docker-openpose:latest
[...]
INFO:    Creating SIF file...
INFO:    Build complete: openpose.simg

As I don’t know how this software works, let’s try a quick test.

[sagon@login2 openpose][master] $ sbatch launchopenpose.sh 
Submitted batch job 25958248
[sagon@login2 openpose][master] $ ls -la ~/tests/output.avi 
-rw-r--r-- 1 sagon unige 17932086 Jan  8 17:49 /home/sagon/tests/output.avi
[sagon@login2 openpose][master] $ cat slurm-25958248.out 
Starting pose estimation demo.
Auto-detecting GPUs... Detected 1 GPU(s), using them all.
Starting thread(s)
Real-time pose estimation demo successfully finished. Total time: 39.014532 seconds.

You can download the script and modify it to suit your needs.

Your feedback is welcome.

Cheers

Thanks for the tutorial.
A very useful Singularity flag is -B, which mounts an external folder onto a folder inside the container. This is very useful when working with huge data files that should live on scratch rather than in the home directory.
For example the flag

-B /home/strassp6/scratch/Data:/Data

will mount the Data folder on scratch to /Data.
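As a minimal sketch of how that flag fits into a full command: the image name openpose.simg comes from the tutorial above, while the `bind_spec` helper and the `ls /Data` check are illustrative assumptions and not part of the original post.

```shell
#!/bin/sh
# Sketch only: bind_spec is a hypothetical helper, not a Singularity feature.
# It joins a host folder and a container folder into the "src:dest" form
# that the -B flag expects.
bind_spec() {
    printf '%s:%s\n' "$1" "$2"
}

HOST_DATA="$HOME/scratch/Data"   # large data kept on scratch (adapt to your account)
IMAGE="openpose.simg"            # image built earlier in this thread

SPEC=$(bind_spec "$HOST_DATA" /Data)

if command -v singularity >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # List the scratch folder as seen from inside the container.
    singularity exec -B "$SPEC" "$IMAGE" ls /Data
else
    # Without Singularity or the image, only show the command.
    echo "singularity exec -B $SPEC $IMAGE ls /Data"
fi
```

If the listing shows your files, anything the containerised program writes under /Data ends up on scratch and not in the image or the home directory.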

To add to what Yann said: a good source of Docker images is Docker Hub and, especially for GPU software, [NVIDIA NGC](https://ngc.nvidia.com/).

Hello, thank you for the tutorial.

I’m interested in doing the same, but with AlphaPose.

I already have a Docker image ready, so now I want to test it on Baobab.
Docker hub / Dockerfile

I’m no docker expert so if you have any advice/recommendation to improve it feel free. I would love to learn.

Anyway, when trying your tutorial, it looks like Singularity is no longer available on Baobab, at least not through a module.

module spider Singularity does not return anything.

Is it still possible to run Docker images on it? If so, how (taking into account the need for a GPU and a mounted directory)?

Thank you for your time.

Dear @Thibaut.Chataing

Singularity was replaced by Apptainer some time ago. The functionality is mostly the same.

If you need to access a GPU through apptainer, you need to use
--nv to enable Nvidia support
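A hedged sketch combining `--nv` with the `-B` flag mentioned earlier in the thread. The image name, the data folder, the `gpu_exec` wrapper, and the `APPTAINER` override variable are placeholders chosen for illustration; the commands are meant to run inside a job that has a GPU allocated.

```shell
#!/bin/sh
# Sketch only: names below are placeholders, adapt them to your setup.
IMAGE="openpose.simg"
DATA_DIR="$HOME/scratch/Data"
APPTAINER="${APPTAINER:-apptainer}"   # set APPTAINER=echo for a dry run

# Hypothetical wrapper: run any command in the container with
#   --nv  the host NVIDIA driver and GPU devices exposed inside the container
#   -B    DATA_DIR mounted at /Data inside the container
gpu_exec() {
    "$APPTAINER" exec --nv -B "$DATA_DIR:/Data" "$IMAGE" "$@"
}

if command -v "$APPTAINER" >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # If --nv worked, nvidia-smi should list the GPU allocated to the job.
    gpu_exec nvidia-smi
fi
```

Once `nvidia-smi` shows the GPU from inside the container, the same wrapper can be used to launch the real program.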


Hello,
I’m trying to use openpose through a Docker image, either "garyfeng/docker-openpose:latest" or "tchataing/op:latest", but I get a CUDA error.

Error:

Cuda check failed (209 vs. 0): no kernel image is available for execution on the device

Coming from:

  • /opt/openpose/src/openpose/net/netCaffe.cpp:initializationOnThread():187

  • /opt/openpose/src/openpose/gpu/cuda.cpp:cudaCheck():37

  • /opt/openpose/src/openpose/net/netCaffe.cpp:initializationOnThread():208

  • /opt/openpose/src/openpose/pose/poseExtractorCaffe.cpp:addCaffeNetOnThread():106

  • /opt/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():632

  • /opt/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():53

  • /opt/openpose/include/openpose/pose/wPoseExtractor.hpp:work():107

  • /opt/openpose/include/openpose/thread/worker.hpp:checkAndWork():93

Error:

Cuda check failed (209 vs. 0): no kernel image is available for execution on the device

Coming from:

  • /opt/openpose/src/openpose/net/netCaffe.cpp:initializationOnThread():187

  • /opt/openpose/src/openpose/gpu/cuda.cpp:cudaCheck():37

  • /opt/openpose/src/openpose/net/netCaffe.cpp:initializationOnThread():208

  • /opt/openpose/src/openpose/pose/poseExtractorCaffe.cpp:addCaffeNetOnThread():106

  • /opt/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():632

  • /opt/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():53

  • /opt/openpose/include/openpose/pose/wPoseExtractor.hpp:work():107

  • /opt/openpose/include/openpose/thread/worker.hpp:checkAndWork():93

Error occurred on a thread. OpenPose closed all its threads and then propagated the error to the main thread. Error description:

Cuda check failed (209 vs. 0): no kernel image is available for execution on the device

Coming from:

  • /opt/openpose/src/openpose/net/netCaffe.cpp:initializationOnThread():187

  • /opt/openpose/src/openpose/gpu/cuda.cpp:cudaCheck():37

  • /opt/openpose/src/openpose/net/netCaffe.cpp:initializationOnThread():208

  • /opt/openpose/src/openpose/pose/poseExtractorCaffe.cpp:addCaffeNetOnThread():106

  • /opt/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():632

  • /opt/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():53

  • /opt/openpose/include/openpose/pose/wPoseExtractor.hpp:work():107

  • /opt/openpose/include/openpose/thread/worker.hpp:checkAndWork():93

  • [All threads closed and control returned to main thread]

  • /opt/openpose/src/openpose/utilities/errorAndLog.cpp:checkWorkerErrors():280

  • /opt/openpose/include/openpose/thread/threadManager.hpp:stop():243

  • /opt/openpose/include/openpose/thread/threadManager.hpp:exec():202

  • /opt/openpose/include/openpose/wrapper/wrapper.hpp:exec():424

I pulled the docker image through
singularity build docker://... or singularity pull docker://...
and tried to run it but I can’t manage the CUDA version properly.

Do you have any insight into how CUDA should be handled between the container and the job?

The first image is 8 years old; I highly doubt you’ll be able to run it on the cluster.

I’ve tried with a newer version from https://hub.docker.com/r/sandeepagowda/openpose (warning: not an official release)

apptainer run --nv docker://sandeepagowda/openpose:CUDA12.0

I have updated the sbatch script here: o/openpose · master · hpc / softs · GitLab

If you have a small working example, contributions are welcome.

Best

Hi,
Thanks for the quick answer.

I still have the same problem.

I tried it in two ways.

  1. With a Slurm job:

I ran the following batch script.

#!/bin/sh

#SBATCH --partition=shared-gpu
#SBATCH --gpus=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=10:00

## https://hub.docker.com/layers/sandeepagowda/openpose/CUDA12.0/images/sha256-7cbc77f912e0b9e7d3035e6e5bc010d34e56a88f7e6a1a1ecca24963afa6cf01

OPENPOSE_URL=docker://sandeepagowda/openpose:CUDA12.0
OPENPOSE_SIMG=openpose.simg
VIDEO_INPUT=/home/share/schaer2/thibaut/opensam/input/balt.mp4
VIDEO_OUTPUT=/home/share/schaer2/thibaut/opensam/output

srun apptainer build $OPENPOSE_SIMG $OPENPOSE_URL


## not all the flags are working with this newer version
srun apptainer exec --nv --pwd /openpose $OPENPOSE_SIMG build/examples/openpose/openpose.bin --video $VIDEO_INPUT --write_video $VIDEO_OUTPUT --display 0
  2. By hand:
(baobab)-[chataint@gpu008 thibaut]$ apptainer build openpose.simg docker://sandeepagowda/openpose:CUDA12.0
(baobab)-[chataint@gpu008 thibaut]$ apptainer run --nv openpose.simg
Apptainer> cd /openpose/
Apptainer> build/examples/openpose/openpose.bin --display 0 --video /home/share/schaer2/thibaut/opensam/input/balt.mp4 --write_video /home/share/schaer2/thibaut/opensam/output
Starting OpenPose demo...
Configuring OpenPose...
Starting thread(s)...
Auto-detecting all available GPUs... Detected 1 GPU(s), using 1 of them starting at GPU 0.

Error:
Cuda check failed (209 vs. 0): no kernel image is available for execution on the device

Coming from:
- /openpose/src/openpose/net/netCaffe.cpp:reshapeNetCaffe():115
- /openpose/src/openpose/gpu/cuda.cpp:cudaCheck():37
- /openpose/src/openpose/net/netCaffe.cpp:reshapeNetCaffe():120
- /openpose/src/openpose/net/netCaffe.cpp:forwardPass():259
- /openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():632
- /openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():53
- /openpose/include/openpose/pose/wPoseExtractor.hpp:work():107
- /openpose/include/openpose/thread/worker.hpp:checkAndWork():93

Error occurred on a thread. OpenPose closed all its threads and then propagated the error to the main thread. Error description:

Cuda check failed (209 vs. 0): no kernel image is available for execution on the device

Coming from:
- /openpose/src/openpose/net/netCaffe.cpp:reshapeNetCaffe():115
- /openpose/src/openpose/gpu/cuda.cpp:cudaCheck():37
- /openpose/src/openpose/net/netCaffe.cpp:reshapeNetCaffe():120
- /openpose/src/openpose/net/netCaffe.cpp:forwardPass():259
- /openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():632
- /openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():53
- /openpose/include/openpose/pose/wPoseExtractor.hpp:work():107
- /openpose/include/openpose/thread/worker.hpp:checkAndWork():93
- [All threads closed and control returned to main thread]
- /openpose/src/openpose/utilities/errorAndLog.cpp:checkWorkerErrors():280
- /openpose/include/openpose/thread/threadManager.hpp:stop():243
- /openpose/include/openpose/thread/threadManager.hpp:exec():202
- /openpose/include/openpose/wrapper/wrapper.hpp:exec():424

Feel free to use the same video “balt.mp4” for test if needed.

Thanks for the help.

It seems the issue is that openpose in this image was compiled for GPU compute capabilities that do not include the one of the GPU you are running on.

I guess you need to rebuild the openpose image. Maybe using the container image isn’t the way to go and you should try to build it from source.
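CUDA error 209 is `cudaErrorNoKernelImageForDevice`, meaning the binary contains no GPU code for the device's architecture. The sketch below is one possible way to compare the two sides. It rests on several assumptions: `LIB` is a placeholder for whichever CUDA-enabled library or binary you want to inspect, `cuobjdump` comes with the CUDA toolkit and may not be present in the image, and the `compute_cap` query field is only known to fairly recent versions of `nvidia-smi`. The `cc_to_sm` helper is hypothetical.

```shell
#!/bin/sh
# Convert a compute capability such as "8.6" into the matching arch name "sm_86".
cc_to_sm() {
    echo "sm_$(echo "$1" | tr -d '.')"
}

# Placeholder (assumption): path of the CUDA-enabled file to inspect.
LIB="${LIB:-/path/to/libcaffe.so}"

if command -v nvidia-smi >/dev/null 2>&1; then
    # Compute capability of the first visible GPU.
    CC=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
    echo "GPU needs kernels for: $(cc_to_sm "$CC")"
fi

if command -v cuobjdump >/dev/null 2>&1 && [ -f "$LIB" ]; then
    # Embedded kernels are listed with their target architecture (sm_XX).
    echo "Architectures found in $LIB:"
    cuobjdump --list-elf "$LIB" | grep -o 'sm_[0-9]*' | sort -u
fi
```

If the GPU's architecture is not covered by the list, and no compatible PTX is embedded either, the error above is expected and the software has to be rebuilt for that architecture.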

Anyone using openpose with success on the cluster?