Hi,
I’m having trouble getting GPU debugging to work with Nsight and Visual Studio 2017.
When I choose “Start CUDA Debugging”, the debugger fails to stop at any breakpoints in CUDA kernel functions, and the Visual Studio Output window for Nsight shows the following:
CUDA context created : 163fed0b990
CUDA module loaded: 16386067b70 foo.cubin
CUDA grid launch failed: CUcontext: 1528988481936 CUmodule: 1526961961840 Function: bar
The CUDA Info window shows Functions:
CUcontext:163fed0b990, CUmodule:16386067b70, Name: bar, Function Address: 0x00073f80, Size:64, Mangled Name: bar
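Note that the context and module values listed for the failed launch look different from those for the context created and module loaded, but they are the same handles printed in decimal instead of hexadecimal (0x163fed0b990 = 1528988481936 and 0x16386067b70 = 1526961961840).
So a handle mismatch is probably not the problem.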
Visual Studio also reports:
Your app has entered a break state, but no code is currently executing that is supported by the selected debug engine.
The Nsight Monitor shows a successful connection from Debugger (WAYNE-PC).
I have an NVIDIA Quadro K1100M GPU.
The list of supported GPUs (https://developer.nvidia.com/nsight-visual-studio-edition-supported-gpus-full-list)
doesn’t include this specific card but does include the very closely related K1000M.
Configuration:
- Windows 10, 64-bit
- Visual Studio 2017, v15.4.5
- CUDA 9.0.176 win10
- Nsight 5.4.0.17229
The same problem occurs with all CUDA apps, including the Nsight code samples (CUDA Debugging - Matrix Multiply, both the Runtime and Driver versions).
I also tested with my own simple Hello World CUDA app:
foo.cu:
extern "C" void __global__ bar()
{
    //printf("Hello");
}
foo.cu is compiled using:
nvcc --cubin -G -gencode=arch=compute_30,code=sm_30 foo.cu
main.cpp (compiled using x64 Debug configuration):
#include <cuda.h>  // CUDA Driver API

// CHECK is an error-checking macro; its definition is omitted here.
int main()
{
    CUcontext context;
    CUdevice device;
    CUmodule fooModule;
    CUfunction barFunction;

    CHECK(cuInit(0));
    CHECK(cuDeviceGet(&device, 0));
    CHECK(cuCtxCreate(&context, 0, device));
    CHECK(cuModuleLoad(&fooModule, "foo.cubin"));
    CHECK(cuModuleGetFunction(&barFunction, fooModule, "bar"));
    // 1x1x1 grid, 1x1x1 block, no shared memory, default stream, no kernel arguments
    CHECK(cuLaunchKernel(barFunction, 1, 1, 1, 1, 1, 1, 0, NULL, NULL, NULL));
    CHECK(cuCtxSynchronize());
}
The app runs fine if I start it without Nsight (using the standard Visual Studio debugger).
If I comment out the call to cuLaunchKernel, “Start CUDA Debugging” runs without error.