Enhance documentation #54

Open
adrianchiris opened this issue Feb 20, 2022 · 1 comment

Comments

@adrianchiris
Collaborator

We should improve the project's README.

The general way to use it with k8s is to utilize a secondary network CNI such as macvlan or ipoib (or essentially any CNI that can create virtual interfaces on top of an existing RDMA-capable parent netdev).

We should update the instructions and examples.
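
For example, something along these lines could go into the README (a sketch only, assuming Multus is installed and that the plugin exposes a resource named rdma/hca_shared_devices_a; the resource name, parent netdev ens3f0, image, and subnet are placeholders, not taken from this issue):

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: rdma-macvlan               # secondary network on top of an RDMA-capable parent netdev
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "macvlan",
    "master": "ens3f0",
    "mode": "bridge",
    "ipam": {
      "type": "host-local",
      "subnet": "192.168.100.0/24"
    }
  }'
---
apiVersion: v1
kind: Pod
metadata:
  name: rdma-test-pod
  annotations:
    k8s.v1.cni.cncf.io/networks: rdma-macvlan   # attach the secondary network via Multus
spec:
  containers:
  - name: rdma-app
    image: mellanox/rping-test                  # placeholder: any image with rdma-core tools
    command: ["sleep", "infinity"]
    securityContext:
      capabilities:
        add: ["IPC_LOCK"]                       # required for registering pinned memory for RDMA
    resources:
      limits:
        rdma/hca_shared_devices_a: 1            # assumed resource name exposed by the plugin

On an InfiniBand fabric, the macvlan config would be swapped for an ipoib CNI config with the same parent-netdev idea.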

@ppkube

ppkube commented Mar 9, 2023

I didn't use a secondary network CNI.

Summary

Setup:

fabric: InfiniBand
pod network: the cluster-default network is used

Test results

The GPUDirect test finished successfully.

image: mellanox/cuda-perftest:latest
server cmd: ib_write_bw -a -F --report_gbits -q 2 --use_cuda 0
client cmd: ib_write_bw -a -F --report_gbits -q 2 --use_cuda 0 <server-pod-default-network-IP>
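
For reference, the server pod for such a test could be declared roughly like this (a sketch under assumptions: the rdma/hca_shared_devices_a resource name and the nvidia.com/gpu request are not taken from this report):

apiVersion: v1
kind: Pod
metadata:
  name: rdma-gpu-test-pod-1
spec:
  containers:
  - name: perftest
    image: mellanox/cuda-perftest:latest
    command: ["ib_write_bw", "-a", "-F", "--report_gbits", "-q", "2", "--use_cuda", "0"]
    securityContext:
      capabilities:
        add: ["IPC_LOCK"]                       # allow pinning memory for RDMA registration
    resources:
      limits:
        rdma/hca_shared_devices_a: 1            # assumed RDMA resource from the device plugin
        nvidia.com/gpu: 1                       # GPU used by the --use_cuda 0 path

The client pod would be identical except that its command appends the server pod's cluster-default network IP.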

Note: ib_write_bw -R reported an error. Not sure why rdma_cm isn't usable in the container, as the device file looks normal.

/dev/infiniband:
total 0
crw------- 1 root root 231,  64 Mar  6 04:13 issm0
crw-rw-rw- 1 root root  10,  56 Mar  6 04:13 rdma_cm
crw------- 1 root root 231,   0 Mar  6 04:13 umad0
crw-rw-rw- 1 root root 231, 192 Mar  6 04:13 uverbs0

Pasting the server side only:

root@rdma-gpu-test-pod-1:~# ib_write_bw -a -F --report_gbits -q 2 --use_cuda 0

************************************
* Waiting for client to connect... *
************************************
initializing CUDA
Listing all CUDA devices in system:
CUDA device 0: PCIe address is B5:00

Picking device No. 0
[pid = 28, dev = 0] device name = [NVIDIA A30]
creating CUDA Ctx
making it the current CUDA Ctx
cuMemAlloc() of a 33554432 bytes GPU buffer
allocated GPU buffer address at 00007f435a000000 pointer=0x7f435a000000
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx5_0
 Number of qps   : 2            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 PCIe relax order: ON
 ibv_wr* API     : ON
 CQ Moderation   : 100
 Mtu             : 4096[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x04 QPN 0x002a PSN 0x34a7d4 RKey 0x1fdfbf VAddr 0x007f435b000000
 local address: LID 0x04 QPN 0x002c PSN 0x6b572e RKey 0x1fdfbf VAddr 0x007f435b800000
 remote address: LID 0x03 QPN 0x002a PSN 0x8a0c0e RKey 0x1fdfbf VAddr 0x007f8b8d000000
 remote address: LID 0x03 QPN 0x002b PSN 0xc19317 RKey 0x1fdfbf VAddr 0x007f8b8d800000
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
 8388608    10000            98.39              98.39              0.001466
---------------------------------------------------------------------------------------
deallocating RX GPU buffer 00007f435a000000
destroying current CUDA Ctx
