CUDA grid launch failed

Hi,

I’m having trouble getting GPU debugging to work with Nsight and Visual Studio 2017.
When I “Start CUDA Debugging”, it fails to stop at any breakpoints in CUDA kernel functions, and the Visual Studio Output window for Nsight shows the following:

CUDA context created : 163fed0b990
CUDA module loaded:   16386067b70 foo.cubin
CUDA grid launch failed: CUcontext: 1528988481936 CUmodule: 1526961961840 Function: bar

The CUDA Info window shows Functions:

CUcontext:163fed0b990, CUmodule:16386067b70, Name: bar, Function Address: 0x00073f80, Size:64, Mangled Name: bar

Note that the Context and Module addresses listed for the failed launch are different from the context created and module loaded.
Is this the problem?

Visual Studio also reports:
Your app has entered a break state, but no code is currently executing that is supported by the selected debug engine.

The Nsight Monitor shows a successful connection from the debugger (WAYNE-PC).

I have an NVIDIA Quadro K1100M GPU.
The list of supported GPUs (https://developer.nvidia.com/nsight-visual-studio-edition-supported-gpus-full-list)
doesn’t include this specific card but does include the very closely related K1000M.

Configuration:

  • Windows 10, 64-bit
  • Visual Studio 2017, v15.4.5
  • CUDA 9.0.176 win10
  • Nsight 5.4.0.17229

The same problem occurs with all CUDA apps, including the Nsight code samples "CUDA Debugging - Matrix Multiply" in both the Runtime and Driver versions.
I also tested with my own simple Hello World CUDA app:

foo.cu:

extern "C" void __global__ bar()
{
	//printf("Hello");
}

foo.cu is compiled using:

nvcc --cubin -G -gencode=arch=compute_30,code=sm_30 foo.cu

main.cpp (compiled using x64 Debug configuration):

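#include <cstdio>
#include <cstdlib>
#include <cuda.h>

// CHECK was not shown in the original post; this definition is an assumption.
// It aborts with a message if a driver API call does not return CUDA_SUCCESS.
#define CHECK(call) do { CUresult r_ = (call); if (r_ != CUDA_SUCCESS) { \
	fprintf(stderr, "%s failed with CUresult %d\n", #call, (int)r_); exit(EXIT_FAILURE); } } while (0)

int main()
{
	CUcontext context;
	CUdevice device;
	CUmodule fooModule;
	CUfunction barFunction;

	CHECK(cuInit(0));                                            // initialise the driver API
	CHECK(cuDeviceGet(&device, 0));                              // first GPU in the system
	CHECK(cuCtxCreate(&context, 0, device));
	CHECK(cuModuleLoad(&fooModule, "foo.cubin"));                // cubin built with nvcc as above
	CHECK(cuModuleGetFunction(&barFunction, fooModule, "bar"));
	// Launch a 1x1x1 grid of 1x1x1 blocks, no shared memory, default stream, no arguments.
	CHECK(cuLaunchKernel(barFunction, 1, 1, 1, 1, 1, 1, 0, NULL, NULL, NULL));
	CHECK(cuCtxSynchronize());                                   // wait for the kernel to finish
}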

The app runs fine if I start without using NSight (using standard Visual Studio debugger).

If I comment out the call to cuLaunchKernel then “Start CUDA Debugging” executes without error.

Please check the TDR setting in the Nsight Monitor options.
Try increasing it and then start debugging again.

Thanks!

Changing the TDR setting fixed the "CUDA grid launch failed" problem.

However, I was still having problems getting it to break at a breakpoint.
I found that the solution to that problem was to run the Nsight Monitor as Administrator.

Now it all works, thanks!

Thank you, w.kelly. Running Nsight Monitor as Administrator solves the issue.