On the host side, I can call gdb.inferiors()[0].read_memory(data_address, size) to dump process memory using the gdb Python interface. However, when I do the same for device memory, the output is all zeros.
Is this simply not implemented in cuda-gdb, or is there something else I need to call?
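A minimal sketch of what I'm running, entered at the gdb Python prompt (data_address and size here are placeholders taken from the debugged program):

(cuda-gdb) python
>inf = gdb.inferiors()[0]
>buf = inf.read_memory(data_address, size)
>print(buf.tobytes())
>end

For a host pointer this prints the expected bytes; for a device pointer it prints only zero bytes.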
The following workaround allows reading device memory from Python. For example, in this application, breaking before the kernel launch (line 52), 0x10207600000 is the address of the array on the device:
(cuda-gdb) l
47
48 HANDLE_CUDA_ERROR(cudaMalloc((void**)&d, sizeof(int)*N));
49 HANDLE_CUDA_ERROR(cudaEventCreate(&asyncWaitEvent));
50 HANDLE_CUDA_ERROR(cudaMemcpy(d, idata, sizeof(int)*N, cudaMemcpyHostToDevice));
51
52 bitreverse<<<1, N, N*sizeof(int)>>>(d);
53 HANDLE_CUDA_ERROR(cudaGetLastError());
54 HANDLE_CUDA_ERROR(cudaEventRecord(asyncWaitEvent, 0));
55
56 /* Spin on the host while kernel is running */
(cuda-gdb) p d
$3 = (void *) 0x10207600000
(cuda-gdb) python