[Question]: Single process with multiple threads pure RDMA data transfer on multiple nodes. #48

### Question

My program controls multiple GPUs on a single node from a single process with multiple threads (for example, 1 process with 8 threads controlling 8 GPUs on a node). How can I implement pure RDMA transfers between GPUs on a single node or across multiple nodes using NVSHMEM (for example, gpu0-on-node0 -> gpu1-on-node0, or gpu2-on-node0 -> gpu3-on-node1)?

I've read the documentation, and it seems that the NVSHMEM host API doesn't support concurrent use from multiple threads. I can't find a way to create 2 x 8 PEs mapped to 2 x 8 GPUs (two nodes, each with eight GPUs) using a single process with multiple threads on each node.
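
To make the target layout concrete, here is a minimal sketch of the PE numbering I have in mind. It is plain Python with no NVSHMEM calls; `pe_of`, `gpu_of`, and the node-by-node numbering are assumptions for illustration only, not something NVSHMEM defines.

```python
# Hypothetical layout only: plain Python, no NVSHMEM calls.
NODES = 2
GPUS_PER_NODE = 8


def pe_of(node, local_gpu):
    """Global PE index for a GPU, assuming PEs are numbered node by node."""
    return node * GPUS_PER_NODE + local_gpu


def gpu_of(pe):
    """Inverse mapping: (node, local_gpu) for a global PE index."""
    return divmod(pe, GPUS_PER_NODE)


# The two transfers from the question, as (source PE, destination PE).
intra_node = (pe_of(0, 0), pe_of(0, 1))  # gpu0-on-node0 -> gpu1-on-node0
inter_node = (pe_of(0, 2), pe_of(1, 3))  # gpu2-on-node0 -> gpu3-on-node1
```

In this sketch each of the 16 GPUs gets its own PE index, and the open question is how to obtain that mapping when the 8 GPUs on a node are driven by threads of one process rather than by 8 separate processes.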
