We would like to utilize GPUDirect RDMA between the Xavier GPU and a PCIe card inserted in the PCIe slot. Is this possible with the Jetson AGX Xavier that we just received?
If it is not possible today then where is this functionality placed on the roadmap?
The GPUDirect interface (as exposed by the NVIDIA discrete GPU PCIe driver on desktop) isn’t supported on Jetson at this time; however, we are tentatively planning it for a future JetPack release. Currently the Jetson integrated GPU driver lives in userspace, unlike the desktop discrete GPU driver, which is a PCIe kernel module.
However, since Jetson’s GPU and CPU share the same physical memory, in theory you could allocate mapped memory with cudaHostAlloc(), pass the pointer to your PCIe device’s driver via an ioctl, and have the driver translate the virtual address to a physical address. That would allow your PCIe device to DMA into the same memory that is mapped to the GPU, accomplishing zero copy between your device and the GPU. I am not sure which functions perform the virtual-to-physical address translation in recent kernels.
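For the user-space half of that idea, a minimal sketch might look like the following. The `cudaHostAlloc()`/`cudaHostGetDevicePointer()` calls are real CUDA runtime APIs; the device node name and the `MYDEV_IOC_PIN_BUFFER` ioctl are hypothetical placeholders for whatever interface your PCIe driver would expose.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t len = 1 << 20;  // 1 MiB buffer
    void *host_ptr = nullptr;

    // Allocate page-locked host memory that is also mapped into the
    // GPU's address space.
    if (cudaHostAlloc(&host_ptr, len, cudaHostAllocMapped) != cudaSuccess) {
        fprintf(stderr, "cudaHostAlloc failed\n");
        return 1;
    }

    // Get the device-side pointer to the same pages; on Jetson the GPU and
    // CPU share physical DRAM, so kernels can read what the PCIe device
    // DMAs into host_ptr, with no copy.
    void *dev_ptr = nullptr;
    cudaHostGetDevicePointer(&dev_ptr, host_ptr, 0);

    // Hypothetical step: hand the virtual address to your PCIe driver,
    // which must pin the pages and resolve virtual -> physical on its side.
    // int fd = open("/dev/mypcidev", O_RDWR);              // made-up node
    // struct pin_req req = { (uintptr_t)host_ptr, len };   // made-up struct
    // ioctl(fd, MYDEV_IOC_PIN_BUFFER, &req);               // made-up ioctl

    cudaFreeHost(host_ptr);
    return 0;
}
```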
Thanks for the prompt, concise, and helpful response. I’m grateful not to head into the weekend still wondering about this. :-)
I will read your suggested approach again on Monday and may…ummm…probably will have further questions about how I tackle this. Your jump straight to zero-copy was apropos, because that is exactly why I wanted to be able to do this…
Based on your answer from Friday, I need to take a step back from my assumptions and regain a collection of first principles.
Toward that end, is there some documentation (and graphics) I can peruse that outlines how analyzing video frames coming in from a PCIe card is supposed to work?
Hi johnu, the normal way is via your typical Linux PCIe kernel driver with DMA. The driver usually allocates DMA buffers and mmaps them into the user-space process, at which point the user can send the data to the GPU. However, if you modify your PCIe driver to accept pointers to DMA to, instead of allocating its own buffers, then you can provide it with pointers that have already been mapped to the GPU by the CUDA memory APIs.
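On the kernel side, the modification might look roughly like this. `get_user_pages_fast()`, `dma_map_page()`, and `dma_mapping_error()` are standard kernel APIs for pinning user pages and mapping them for device DMA, though the exact `get_user_pages_fast()` signature has changed across kernel versions; the function and device names here are hypothetical, and error cleanup is omitted for brevity.

```c
/* Sketch of an ioctl-handler helper that pins a user buffer (e.g. one
 * allocated with cudaHostAlloc() in user space) and maps it for device
 * DMA. Unpinning/unmapping on error or teardown is left out. */
#include <linux/mm.h>
#include <linux/dma-mapping.h>

static int mydev_pin_buffer(struct device *dev, unsigned long vaddr,
                            struct page **pages, int nr_pages,
                            dma_addr_t *dma_addrs)
{
    int i, pinned;

    /* Pin the user pages so they cannot be swapped out or migrated
     * while the device is DMAing into them. */
    pinned = get_user_pages_fast(vaddr, nr_pages, FOLL_WRITE, pages);
    if (pinned < nr_pages)
        return -EFAULT;

    /* Map each page for DMA; the returned dma_addr_t values are what
     * you program into the device's DMA descriptors. */
    for (i = 0; i < nr_pages; i++) {
        dma_addrs[i] = dma_map_page(dev, pages[i], 0, PAGE_SIZE,
                                    DMA_FROM_DEVICE);
        if (dma_mapping_error(dev, dma_addrs[i]))
            return -EIO;
    }
    return 0;
}
```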
Which L4T release are you using (only L4T r32.1 is supported), and which Jetson board are you compiling the code on (only Jetson AGX Xavier is supported)?
Please show the complete log from the build command.
You don’t need to use anything in this example to connect two Xaviers over PCIe. However, you likely can use GPUDirect RDMA between two Xaviers, and so portions of this example may be useful in setting that up. The root port Xavier would export its memory using RDMA, and the endpoint Xavier would then access it. You’d need to develop some custom protocol (and drivers) to pass the memory pointers etc. back and forth between the two systems.