
Configuring openmpi5.0.8 against existing xpmem module #13492

@hpc-harlequin


Hello dear OMPI community,
I was trying to build openmpi@5.0.8 using Spack on a Red Hat Enterprise Linux release 9.4 (Plow) system.
My Spack command was spack install openmpi@5.0.8 +internal-hwloc +openshmem +internal-pmix schedulers=slurm fabrics=ucx,xpmem,knem (side note: I am not sure whether enabling both the knem and xpmem fabrics makes sense).

For context: this results in --with-cray-xpmem, but since I'm not on a Cray system I edited the Spack package to link against the local xpmem module with --with-xpmem=/usr --with-xpmem=$HOME/buildsources/xpmem_src. On this system I only have the module's shared object and no headers, so I provide the xpmem source files for the includes (I made sure they are the right version).

The xpmem package is installed like so:

$ rpm -qi xpmem
Name        : xpmem
Version     : 2.7.3
$ rpm -ql xpmem
/lib/udev/rules.d/56-xpmem.rules
/usr/lib/modules-load.d/xpmem.conf
/usr/share/doc/xpmem
/usr/share/doc/xpmem/AUTHORS
/usr/share/doc/xpmem/COPYING
/usr/share/doc/xpmem/COPYING.LESSER
/usr/share/doc/xpmem/README
$ find /usr/ -name "*xpmem*"
/usr/lib/modules-load.d/xpmem.conf
/usr/lib/modules/5.14.0-427.49.1.el9_4.x86_64/extra/xpmem
/usr/lib/modules/5.14.0-427.49.1.el9_4.x86_64/extra/xpmem/xpmem.ko
/usr/lib/modules/5.14.0-427.50.1.el9_4.x86_64/weak-updates/xpmem
/usr/lib/modules/5.14.0-427.50.1.el9_4.x86_64/weak-updates/xpmem/xpmem.ko
/usr/lib/udev/rules.d/56-xpmem.rules
/usr/lib64/libxpmem.so.0
/usr/lib64/ucx/libuct_xpmem.so.0
/usr/lib64/ucx/libuct_xpmem.so.0.0.0
/usr/lib64/libxpmem.so.0.0.0

But that didn't work, so I tried a minimal test with ./configure --disable-silent-rules --with-xpmem=$HOME/buildsources/xpmem_src --with-xpmem-libdir=/usr/lib64.
I ran this on a clean tarball extraction of Open MPI 5.0.8.
Here is the configure output: configure.log and config.log.

The output tells me that

checking for xpmem_make... no
configure: error: XPMEM support requested but not found.  Aborting

which I'm guessing comes from this little C blob in configure:

extern void xpmem_make (void);
int main(int argc, char *argv[]) {
    xpmem_make ();
    return 0;
}
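
The command configure actually ran for this check, including its linker flags, is recorded in config.log. Below is a minimal sketch for pulling out just that section. It assumes the usual Autoconf log layout, where each test starts with a "checking for ..." line and ends with a "result:" line. The function name show_check is hypothetical.

```shell
# Print the config.log section for one configure check, from its
# "checking for <symbol>" line through the matching "result:" line.
show_check() {
    symbol=$1
    log=${2:-config.log}
    awk -v pat="checking for $symbol" '
        index($0, pat)        { printing = 1 }
        printing              { print }
        printing && /result:/ { printing = 0 }
    ' "$log"
}

# Usage: run from the directory where ./configure was executed.
# show_check xpmem_make config.log
```

The lines between the "checking for" line and the "result:" line normally include the full compile and link command plus any linker error, which shows which -l and -L flags were really used.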

However, when I build this little file with gcc check-for-xpmem_make.c -l:libxpmem.so.0 -L /usr/lib64, it seems to work fine. So xpmem_make definitely exists:

$ nm -D /usr/lib64/libxpmem.so.0
                 w __cxa_finalize@GLIBC_2.2.5
                 U __errno_location@GLIBC_2.2.5
                 U fcntl@GLIBC_2.2.5
                 w __gmon_start__
                 U ioctl@GLIBC_2.2.5
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
                 U open@GLIBC_2.2.5
                 U __stack_chk_fail@GLIBC_2.4
                 U stat@GLIBC_2.33
00000000000014f0 T xpmem_attach
0000000000001570 T xpmem_detach
0000000000001430 T xpmem_get
0000000000001210 T xpmem_init
00000000000012c0 T xpmem_ioctl
0000000000001370 T xpmem_make
00000000000014a0 T xpmem_release
00000000000013e0 T xpmem_remove
00000000000015c0 T xpmem_version
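
One difference between the manual gcc test and a typical Autoconf library check is worth making explicit. -l:libxpmem.so.0 names the versioned file directly, whereas -lxpmem makes the linker search for libxpmem.so (or libxpmem.a), and the find output above lists no unversioned libxpmem.so. Whether the Open MPI check really links with -lxpmem is an assumption here; config.log would confirm it. Below is a small sketch that tests a directory for this situation. The function name check_dev_link is hypothetical.

```shell
# Report whether a directory holds what "-l<name>" needs (lib<name>.so or
# lib<name>.a) or only versioned runtime files such as lib<name>.so.0.
check_dev_link() {
    dir=$1
    name=$2
    if [ -e "$dir/lib$name.so" ] || [ -e "$dir/lib$name.a" ]; then
        echo "linkable: -l$name should resolve in $dir"
    elif ls "$dir/lib$name.so."* >/dev/null 2>&1; then
        echo "runtime-only: $dir has lib$name.so.* but no lib$name.so"
    else
        echo "missing: no lib$name in $dir"
    fi
}

# Usage on the system described here:
# check_dev_link /usr/lib64 xpmem
```

A "runtime-only" result would be consistent with the manual -l:libxpmem.so.0 link succeeding while a -lxpmem link fails.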

This seems to be some kinda linking issue in the configure test. Is there something I can do to fix this or circumvent it?
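
If the check fails because the linker cannot find an unversioned libxpmem.so (an assumption; the find output lists only libxpmem.so.0 and libxpmem.so.0.0.0), one possible way around it without touching /usr is a small private prefix that exposes the headers and an unversioned symlink side by side. This is an untested sketch, not a confirmed fix. The function name make_xpmem_shim, the shim path, and the location of xpmem.h inside the source tree are all assumptions.

```shell
# Build a private prefix with include/ and lib/ that point at the XPMEM
# header from the source tree and the system's versioned library.
# Nothing under /usr is modified.
make_xpmem_shim() {
    shim=$1       # new prefix to create
    hdr_dir=$2    # directory containing xpmem.h
    so_file=$3    # versioned shared object, e.g. /usr/lib64/libxpmem.so.0
    mkdir -p "$shim/include" "$shim/lib" || return 1
    ln -sf "$hdr_dir/xpmem.h" "$shim/include/xpmem.h"
    # The unversioned name is what "-lxpmem" looks for.
    ln -sf "$so_file" "$shim/lib/libxpmem.so"
}

# Hypothetical usage; the header directory inside the source tree is an assumption:
# make_xpmem_shim "$HOME/xpmem_shim" "$HOME/buildsources/xpmem_src/include" /usr/lib64/libxpmem.so.0
# ./configure --with-xpmem="$HOME/xpmem_shim" --with-xpmem-libdir="$HOME/xpmem_shim/lib"
```

The two configure flags are the same ones used in the minimal test above; only the paths they point at change.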
