Munge support is missing (optional dependency for PMIx in software layer) #216

@bartoldeman

Running EESSI on the Canadian cluster Narval with srun, I see these warnings (not fatal):

$ srun --mpi=pmix ./hellompi
...
A requested component was not found, or was unable to be opened.  This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
PMIx stopped checking at the first component that it did not find.

Host:      nc11103.narval.calcul.quebec
Framework: psec
Component: munge
...
Hello world from processor nc11104.narval.calcul.quebec, rank 3 out of 4 processors
Hello world from processor nc11103.narval.calcul.quebec, rank 0 out of 4 processors
Hello world from processor nc11103.narval.calcul.quebec, rank 1 out of 4 processors
Hello world from processor nc11104.narval.calcul.quebec, rank 2 out of 4 processors

This apparently happens when the libpmix that Slurm uses has been compiled with munge support but the PMIx module has not (caveat: I don't know what happens the other way around, i.e. system PMIx without munge but module PMIx with munge...).

This can be silenced using

export PMIX_MCA_psec=^munge
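
The export above changes the environment for the whole shell session. As a minimal sketch of a narrower variant, assuming a POSIX shell, the helper below applies the setting to a single command only and leaves alone any `PMIX_MCA_psec` value that is already set. The name `run_without_munge_psec` is hypothetical and not part of EESSI, Slurm, or PMIx.

```shell
# Hypothetical helper (illustration only): run one command with the munge
# psec component excluded, unless PMIX_MCA_psec is already set to a
# non-empty value, in which case that value is passed through unchanged.
run_without_munge_psec() {
    env PMIX_MCA_psec="${PMIX_MCA_psec:-^munge}" "$@"
}

# Intended usage on a cluster (not executed here):
#   run_without_munge_psec srun --mpi=pmix ./hellompi
```

Because the variable is passed through `env`, it reaches only the launched command and its children; the calling shell's environment stays as it was.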

It may be better, though, to add libmunge to the compatibility layer and rebuild PMIx so that it is picked up. An example ebuild we are using ourselves is here:
https://github.com/ComputeCanada/gentoo-overlay/blob/main/sys-auth/munge/munge-0.5.15.ebuild
Of course libmunge could also be provided by a module.
