Help needed running MPI/Palabos software using Singularity

Dear community, dear HPC staff,

I am working on a project whose aim is to make certain Palabos biomedical applications available to a wider audience. This includes the Palabos applications themselves as well as a series of batch scripts and other tools (MeshLab) to pre-process the medical imaging data provided as input to the application.

To facilitate deployment to end users, I have created a Docker image, which works perfectly.

I am currently trying to use this image on Baobab (after converting it with the ‘singularity build’ command), but I get the following error message:

*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[node001.cluster:00017] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
srun: error: node001: task 0: Exited with exit code 1

The sbatch script I’m using to test this is as follows:

#!/bin/bash
#
#SBATCH -J sfdm
#SBATCH -e sfdm-error.e%j
#SBATCH -o sfdm-out.o%j
#SBATCH --cpus-per-task=1
#SBATCH --tasks=1

module load GCCcore/8.2.0 Singularity/3.4.0-Go-1.12

srun singularity run -B $(pwd)/patient2:/biomed/mount biomed.simg stent sfdmsim

One should note that the entrypoint of the image doesn’t call mpirun (leaving that to ‘srun’).

Any feedback on how to solve this problem, or on which next debug step I should investigate would be welcome.

JF

Hello JF,

According to this page, the OpenMPI version in your Docker image should be compatible with the one you are using on the cluster. I think “compatible” means an identical major version.

In your case, you didn’t load an MPI module in your sbatch script. You should load one.
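
For example, something like this in your sbatch script (the exact module names are whatever module spider reports as compatible with the OpenMPI inside your image):

# Load an MPI module alongside Singularity; the module names below are
# illustrative, pick the ones module spider reports on your cluster.
module load GCC/8.2.0-2.31.1 OpenMPI/3.1.3 Singularity/3.4.0-Go-1.12

srun singularity run -B $(pwd)/patient2:/biomed/mount biomed.simg stent sfdmsim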

Hi Yann,

Using module spider, I figured out that the modules to load are “GCC/8.2.0-2.31.1 OpenMPI/3.1.3 Singularity/3.4.0-Go-1.12”.

So I installed OpenMPI 3.1.3 from source in my container and built my Palabos application against it. Testing in Docker shows that my application works well.
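
For reference, the build inside the image was roughly along these lines (a sketch; the download URL and install prefix are the usual ones, adjust as needed):

# Build OpenMPI 3.1.3 from source inside the Docker image
# (install prefix is illustrative).
wget https://download.open-mpi.org/release/open-mpi/v3.1/openmpi-3.1.3.tar.gz
tar xzf openmpi-3.1.3.tar.gz
cd openmpi-3.1.3
./configure --prefix=/usr/local
make -j"$(nproc)" && make install
ldconfig
# The Palabos application is then built with the resulting mpicxx wrapper.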

However, running the converted container (built with singularity build) gives me another error message:

[node001.cluster:00018] OPAL ERROR: Not initialized in file pmix2x_client.c at line 109
--------------------------------------------------------------------------
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:

  version 16.05 or later: you can use SLURM's PMIx support. This
  requires that you configure and build SLURM --with-pmix.

  Versions earlier than 16.05: you must use either SLURM's PMI-1 or
  PMI-2 support. SLURM builds PMI-1 by default, or you can manually
  install PMI-2. You must then build Open MPI using --with-pmi pointing
  to the SLURM PMI library location.

Please configure as appropriate and try again.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[node001.cluster:00018] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
srun: error: node001: task 0: Exited with exit code 1

Any help appreciated.

If you want to try it yourself:

mkdir -p scratch/tmp
cd scratch/tmp
singularity build --force biomed-openmpi3.simg docker://registry.gitlab.com/unigespc/biomed-pub:openmpi3
module purge
module load GCC/8.2.0-2.31.1  OpenMPI/3.1.3  Singularity/3.4.0-Go-1.12
srun singularity exec biomed-openmpi3.simg /biomed/showmpiversion 

(showmpiversion is this : https://pastebin.com/tTDUyms2)

Expected output (when running on a regular Linux workstation):

[jfburdet:lnxjfb]$ docker run -it --entrypoint /biomed/showmpiversion registry.gitlab.com/unigespc/biomed-pub:openmpi3
Open MPI v3.1.6, package: Open MPI root@2247d8d74647 Distribution, ident: 3.1.6, repo rev: v3.1.6, Mar 18, 2020

Hello JF,

You should probably compile OpenMPI against Slurm and PMI in your Docker image.

This is what we have as options when we compile OpenMPI using EasyBuild on Baobab.

# to enable SLURM integration (site-specific)
# configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr'
configopts += '--with-slurm --with-pmi'

You can find the Slurm and PMI headers on login2.baobab.hpc.unige.ch.

[root@login2 ~]# rpm -ql slurm-devel-20.02.4-1+ipmi+hdf5.el7.x86_64
/usr/include/slurm
/usr/include/slurm/pmi.h
/usr/include/slurm/pmi2.h
/usr/include/slurm/slurm.h
/usr/include/slurm/slurm_errno.h
/usr/include/slurm/slurmdb.h
/usr/include/slurm/smd_ns.h
/usr/include/slurm/spank.h
/usr/lib64/pkgconfig
/usr/lib64/pkgconfig/slurm.pc
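
If you build OpenMPI from source in your image rather than through EasyBuild, the equivalent configure call would be something like this (a sketch; it assumes the headers and libraries listed above are visible at the same paths inside your build environment):

# Configure OpenMPI with Slurm/PMI support; the PMI paths mirror the
# commented EasyBuild options above and are site-specific.
./configure --prefix=/usr/local \
            --with-slurm \
            --with-pmi=/usr/include/slurm \
            --with-pmi-libdir=/usr
make -j"$(nproc)" && make install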

Salut Yann,

I wanted my container to have as few dependencies as possible, so that my end users will be able to run it against any Slurm cluster.

Compiling my container content against a given version of Slurm would break that and, as I understand it, doesn’t follow the Singularity mantra “Bring your own environment to the cluster”.

Are you sure this is the only solution?

(Anyway, I’ll do some tests with the --with-slurm configure option and come back later with a follow-up.)

Hello,

If you compile against a recent Slurm version like the one we have on Baobab, it should be compatible with other clusters using Slurm as well. And if the cluster isn’t using Slurm or PMI, it should still work.

This is to be tested; it’s only a supposition.

You can also try to deactivate PMI:

[sagon@login2 ~] $ srun --mpi=none mpirun hostname
node001.cluster

FYI, I built OpenMPI inside the container with “./configure --with-slurm” and this gives me the same error message.

But did you have access to the Slurm and PMI headers from within the container? And did you add the --with-pmi flag as well? I think you also need access to /usr/lib64/libpmi2.so and /usr/lib64/slurm/libslurmfull.so to be able to compile your software.
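
For example, a quick way to check whether the image actually sees them (using the image name from your commands above):

# Check from inside the image that the Slurm/PMI headers and libraries
# mentioned above are visible; any missing path will show up as an error.
singularity exec biomed-openmpi3.simg \
    ls /usr/include/slurm/pmi.h /usr/include/slurm/pmi2.h \
       /usr/lib64/libpmi2.so /usr/lib64/slurm/libslurmfull.so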

Ok, I made some progress. I will post a working example soon as a reference.

I realized that the process I’m calling can sometimes launch TWO MPI programs (one after the other) inside my “srun singularity” session: it seems that the first call runs fine, but the second call fails (broken pipe).

At first, I believed this was a bug, but then realized it is most likely working as designed: Slurm intercepts the MPI_Finalize call, and any further call to MPI_Init fails because it expects the process to quit.
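
If that is the case, a possible (untested) workaround would be to give each MPI program its own job step instead of chaining them behind a single srun call, along these lines (the two program names below are placeholders for the two MPI stages):

# Untested sketch: one srun job step per MPI program, so that each
# MPI_Init/MPI_Finalize pair gets its own PMI context.
# /biomed/first_mpi_step and /biomed/second_mpi_step are placeholders.
srun singularity exec -B $(pwd)/patient2:/biomed/mount biomed.simg /biomed/first_mpi_step
srun singularity exec -B $(pwd)/patient2:/biomed/mount biomed.simg /biomed/second_mpi_step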

Yann, what do you think about that? Does this explanation seem valid to you?

Dear all,

You’ll find a sample project in this repository: https://gitlab.com/jfburdet/mpi-sandbox.