Installation | Usage | Release Notes | Documentation | How to Contribute | License
Intel® SHMEM provides an efficient implementation of GPU-initiated communication on systems with Intel GPUs.
- Linux OS
- Intel® oneAPI DPC++/C++ Compiler 2025.0 or higher.
Intel® SHMEM requires the Intel® oneAPI DPC++/C++ Compiler with Level Zero support.
For detailed information on Level Zero, refer to the Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver repository or to the installation guide for oneAPI users.
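As an optional sanity check before building, you can confirm that the Level Zero driver stack can see your GPUs using the sycl-ls tool that ships with the oneAPI compiler (this assumes the oneAPI environment has already been sourced; the exact device strings vary by system):

```sh
# Optional: list SYCL devices exposed through the Level Zero backend.
# Assumes the oneAPI environment (e.g. setvars.sh) has been sourced.
sycl-ls | grep -i level_zero
```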
To install Level Zero, clone the oneAPI Level Zero repository.
git clone https://github.com/oneapi-src/level-zero.git
Build and install Level Zero following the instructions below.
cd level-zero
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=<level_zero_dir> ..
make -j
make install
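As a quick check that the install landed where the Intel® SHMEM build can find it, the standard Level Zero loader artifacts should now be present under <level_zero_dir> (library suffixes and lib vs. lib64 may vary by system):

```sh
# Verify the Level Zero headers and loader library were installed.
ls <level_zero_dir>/include/level_zero/ze_api.h
ls <level_zero_dir>/lib*/libze_loader.so*
```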
Intel® SHMEM requires a host OpenSHMEM or MPI backend for scale-out communication. In particular, the OpenSHMEM backend relies on a collection of extension APIs (shmemx_heap_create, shmemx_heap_preinit, and shmemx_heap_postinit) to coordinate the Intel® SHMEM and OpenSHMEM heaps. We recommend Sandia OpenSHMEM (SOS) v1.5.3 or newer for this purpose. A work-in-progress branch of OSHMPI is also supported but is currently considered experimental. See the Building OSHMPI section below for more details.
We recommend the Intel® MPI Library as the MPI backend option for the current version of Intel® SHMEM. See the Building Intel® SHMEM section below for more details.
Download the SOS repository, which will be configured as a backend for Intel® SHMEM.
git clone -b v1.5.3 --recurse-submodules https://github.com/Sandia-OpenSHMEM/SOS.git SOS
Build SOS following the instructions below. FI_HMEM support in the provider is required for use with Intel® SHMEM. To enable FI_HMEM with a supported provider, we recommend a specific set of configure flags. Below are examples for configuring and building SOS with providers that support FI_HMEM. To configure SOS with the verbs;ofi_rxm provider, use the following instructions:
cd SOS
./autogen.sh
CC=icx CXX=icpx ./configure --prefix=<shmem_dir> --with-ofi=<ofi_installation> --enable-pmi-simple --enable-ofi-mr=basic --disable-ofi-inject --enable-ofi-hmem --disable-bounce-buffers --enable-hard-polling
make -j
make install
To configure SOS with the HPE Slingshot provider cxi, please use the following instructions:
cd SOS
./autogen.sh
CC=icx CXX=icpx ./configure --prefix=<shmem_dir> --with-ofi=<ofi_installation> --enable-pmi-simple --enable-ofi-mr=basic --disable-ofi-inject --enable-ofi-hmem --disable-bounce-buffers --enable-ofi-manual-progress --enable-mr-endpoint --disable-nonfetch-amo --enable-manual-progress
make -j
make install
To configure SOS with the psm3 provider, please use the following instructions:
cd SOS
./autogen.sh
CC=icx CXX=icpx ./configure --prefix=<shmem_dir> --with-ofi=<ofi_installation> --enable-pmi-simple --enable-manual-progress --enable-ofi-hmem --disable-bounce-buffers --enable-ofi-mr=basic --enable-mr-endpoint
make -j
make install
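Before building Intel® SHMEM against this SOS installation, it can be worth confirming that the libfabric pointed to by <ofi_installation> actually exposes the provider you configured for. Assuming your libfabric build includes the fi_info utility, a quick check looks like:

```sh
# List the providers available from the libfabric installation used above.
<ofi_installation>/bin/fi_info -l

# Show details for a specific provider, e.g. verbs, cxi, or psm3.
<ofi_installation>/bin/fi_info -p cxi
```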
Please choose an appropriate PMI configure flag based on the PMI client library available on the system; see the SOS Wiki pages for further instructions. Optionally, users may also add --disable-fortran, since the Fortran interfaces will not be used.
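If you are unsure which PMI options your SOS checkout supports, the configure script can list them before you pick one:

```sh
# From the SOS source directory (after ./autogen.sh), list PMI-related options.
./configure --help | grep -i pmi
```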
Intel® SHMEM has experimental support for OSHMPI when built using the Intel® MPI Library. For setup instructions, see Get Started with the Intel® MPI Library on Linux.
To download the OSHMPI repository:
git clone -b wip/ishmem --recurse-submodules https://github.com/davidozog/oshmpi.git oshmpi
After ensuring the Intel® MPI Library is present in the environment, please build OSHMPI following the instructions below.
cd oshmpi
./autogen.sh
CC=mpiicx CXX=mpiicpx ./configure --prefix=<shmem_dir> --disable-fortran --enable-rma=direct --enable-amo=direct --enable-async-thread=yes
make -j
make install
Check that the SOS build process has successfully created a <shmem_dir> directory with include and lib as subdirectories, and verify that shmem.h and shmemx.h are present in include.
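One way to perform this check from the shell, including a quick confirmation that the heap extension APIs mentioned earlier are declared in the installed headers (paths assume the <shmem_dir> prefix used during configure):

```sh
# Confirm the backend install tree looks complete.
ls <shmem_dir>/include/shmem.h <shmem_dir>/include/shmemx.h
ls <shmem_dir>/lib

# Confirm the heap extension APIs used by Intel® SHMEM are declared.
grep -E "shmemx_heap_(create|preinit|postinit)" <shmem_dir>/include/shmemx.h
```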
Build Intel® SHMEM with an OpenSHMEM backend using the following instructions:
cd ishmem
mkdir build
cd build
cmake .. -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DENABLE_OPENSHMEM=ON -DSHMEM_DIR=<shmem_dir> -DCMAKE_INSTALL_PREFIX=<ishmem_install_dir>
make -j
Alternatively, Intel® SHMEM can be built with the Intel® MPI Library backend enabled. For setup instructions, see Get Started with the Intel® MPI Library on Linux.
cmake .. -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DENABLE_MPI=ON -DMPI_DIR=<impi_dir> -DCMAKE_INSTALL_PREFIX=<ishmem_install_dir>
where <impi_dir> is the path to the Intel® MPI Library installation.
Enabling both the OpenSHMEM and MPI backends is also supported. You may specify the default runtime at configure time with -DISHMEM_DEFAULT_RUNTIME=<backend>, where <backend> is MPI or OPENSHMEM (case-insensitive). The desired backend can also be selected at runtime via the environment variable ISHMEM_RUNTIME=<backend>. If a default runtime is not specified, it is selected automatically from the enabled backends in the following order: OPENSHMEM, then MPI.
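As a sketch of the dual-backend configuration described above (the flags are the same ones used earlier in this section; paths are placeholders):

```sh
# Configure Intel® SHMEM with both backends enabled, defaulting to OpenSHMEM.
cmake .. -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx \
      -DENABLE_OPENSHMEM=ON -DSHMEM_DIR=<shmem_dir> \
      -DENABLE_MPI=ON -DMPI_DIR=<impi_dir> \
      -DISHMEM_DEFAULT_RUNTIME=OPENSHMEM \
      -DCMAKE_INSTALL_PREFIX=<ishmem_install_dir>
make -j
```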
Validate that Intel® SHMEM was built correctly by running an example program.
- Add the path for the backend library to the environment, for example:
export LD_LIBRARY_PATH=<shmem_dir>/lib:$LD_LIBRARY_PATH
source <impi_dir>/env/vars.sh
- Run the example program or test on an allocated node using a process launcher:
ISHMEM_RUNTIME=<backend> mpiexec.hydra -n 2 -hosts <allocated_node_id> ./scripts/ishmrun ./test/unit/int_get_device
- Note: Currently supported launchers include MPI process launchers (i.e. mpiexec, mpiexec.hydra, mpirun, etc.), Slurm (i.e. srun, salloc, etc.), and PBS (i.e. qsub). A Slurm-based example appears after the sample output below.
- Note: The Intel® SHMEM execution model requires applications to use a 1:1 mapping between PEs and GPU devices. Attempting to run an application without the ishmrun launch script may result in failure if this mapping is not maintained.
- For further details on device selection, please see the ONEAPI_DEVICE_SELECTOR documentation.
- Validate the application ran successfully; example output:
Selected device: Intel(R) Data Center GPU Max 1550
Selected vendor: Intel(R) Corporation
Selected device: Intel(R) Data Center GPU Max 1550
Selected vendor: Intel(R) Corporation
No errors
No errors
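Since Slurm and PBS launchers are also supported (per the note above), an equivalent Slurm-based launch might look like the following sketch; the exact srun flags depend on your site's Slurm configuration:

```sh
# Launch the same example with Slurm instead of mpiexec.hydra.
ISHMEM_RUNTIME=<backend> srun -N 1 -n 2 ./scripts/ishmrun ./test/unit/int_get_device
```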
ctest can be used to run Intel® SHMEM tests that are generated at compile-time. To see a list of tests available via ctest, run:
ctest -N
To launch a single test, execute:
ctest -R <test_name>
Alternatively, all the tests in a directory (such as test/unit/) can be run with the following command:
ctest --test-dir <directory_name>
By default, whether a test passed or failed can be determined from the output:
Start 69: sync-2-gpu
1/1 Test #69: sync-2-gpu ....................... Passed 2.29 sec
100% tests passed, 0 tests failed out of 1
To have a test's output printed to the console, add either the --verbose or --output-on-failure flag to the ctest command.
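For example, combining the selection and output options above (sync-2-gpu is the test name from the sample output; substitute any name reported by ctest -N):

```sh
# Run all unit tests and print output only for failures.
ctest --test-dir test/unit --output-on-failure

# Run a single test by name with full output.
ctest -R sync-2-gpu --verbose
```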
The following values may be assigned to CTEST_LAUNCHER at configure time (e.g. -DCTEST_LAUNCHER=mpi) to select which scheduler will be used to run tests launched through ctest:
- srun (default)
  - Launches CTest jobs on a single node using Slurm's srun.
- mpi
  - Uses mpirun to launch CTest jobs with the appropriate number of processes.
- qsub
  - Launches CTest jobs on a single node using qsub. If this option is being used on a system where a reservation must be made (i.e. via pbs resnode) prior to running a test, assign the JOB_QUEUE environment variable to the queue associated with your reservation:
    export JOB_QUEUE=<queue>
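For example, to configure the build so that ctest drives jobs through mpirun, and to point qsub-based runs at a reservation queue (the cmake flags besides -DCTEST_LAUNCHER are the same ones used earlier; the queue name is a placeholder):

```sh
# Configure-time selection of the ctest launcher.
cmake .. -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx \
      -DENABLE_OPENSHMEM=ON -DSHMEM_DIR=<shmem_dir> -DCTEST_LAUNCHER=mpi

# Only needed for the qsub launcher on systems that require a reservation.
export JOB_QUEUE=<queue>
```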
The following environment settings are either required or recommended when running Intel® SHMEM on the specified hardware. For GPU-specific environment settings, the launch script ishmrun will automatically detect and set the appropriate environment. For interconnect-specific environment settings, it is up to the user to ensure the appropriate environment is set:
- HPE Slingshot Interconnect
FI_CXI_OPTIMIZED_MRS=0 is required when running with an OpenSHMEM backend. FI_CXI_DEFAULT_CQ_SIZE=131072 is recommended for all backends.
- Mellanox ConnectX® Interconnects
MLX5_SCATTER_TO_CQE=0 is required when running with an OpenSHMEM backend.
- Intel® Data Center GPU Max Series
EnableImplicitScaling=0 is required. Note: you will also need to ensure NEOReadDebugKeys=1 in case it is not already set.
- Intel® Arc™ B-Series GPUs
RenderCompressedBuffersEnabled=0 is required. Note: you will also need to ensure NEOReadDebugKeys=1 in case it is not already set.
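As an illustration, a job script on an HPE Slingshot system with Intel® Data Center GPU Max Series devices and an OpenSHMEM backend might export the settings above as follows (ishmrun normally sets the GPU-specific variables automatically, so the last two exports are only needed when not using the launch script):

```sh
# Interconnect-specific settings (user's responsibility).
export FI_CXI_OPTIMIZED_MRS=0        # required with an OpenSHMEM backend
export FI_CXI_DEFAULT_CQ_SIZE=131072 # recommended for all backends

# GPU-specific settings (normally handled by ishmrun).
export NEOReadDebugKeys=1
export EnableImplicitScaling=0
```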
