I am new to Nsight for Visual Studio and am having trouble displaying dynamically allocated shared variables. While stopped in a kernel during CUDA debugging in Visual Studio, I cannot display values for dynamically allocated shared variables in either the watch window or the memory window.
In the following code, I can inspect the value of s_static in the first kernel at a breakpoint and in the memory window, but I can't inspect the value of s_dynamic in the second kernel either way. The value returned for d_val after the kernel launch is correct in either case.
__global__ void SharedDisplayStatic(int* d_val)
{
    __shared__ int s_static[1];
    if (threadIdx.x == 0)
    {
        s_static[0] = threadIdx.x;
        *d_val = s_static[0];
    }
}

__global__ void SharedDisplayDynamic(int* d_val)
{
    extern __shared__ int s_dynamic[];
    if (threadIdx.x == 0)
    {
        s_dynamic[0] = threadIdx.x;
        *d_val = s_dynamic[0];
    }
}

void HostCallingFunction()
{
    int* d_val(NULL);
    int val(1);
    cudaMalloc(&d_val, sizeof(int));

    cudaMemset(d_val, 1, sizeof(int));
    SharedDisplayStatic<<<1, 1>>>(d_val);
    cudaMemcpy(&val, d_val, sizeof(int), cudaMemcpyDeviceToHost);

    cudaMemset(d_val, 1, sizeof(int));
    SharedDisplayDynamic<<<1, 1, 4>>>(d_val);
    cudaMemcpy(&val, d_val, sizeof(int), cudaMemcpyDeviceToHost);
}
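In case it helps anyone reproduce this, here is a minimal standalone sketch of the dynamic case with error checking added, to rule out a failed launch as the cause. CHECK_CUDA and RunDynamicRepro are names I made up for this sketch and are not part of the CUDA toolkit. The shared memory size is written as sizeof(int) rather than the literal 4.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not part of the CUDA toolkit):
// print the error string and abort if a runtime call fails.
#define CHECK_CUDA(call)                                             \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,  \
                         cudaGetErrorString(err_));                  \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

__global__ void SharedDisplayDynamic(int* d_val)
{
    // Size of this array is set by the third launch parameter.
    extern __shared__ int s_dynamic[];
    if (threadIdx.x == 0)
    {
        s_dynamic[0] = threadIdx.x;
        *d_val = s_dynamic[0];
    }
}

// Hypothetical wrapper: launches the kernel and returns the value it wrote.
int RunDynamicRepro()
{
    int* d_val = NULL;
    int val = 1;
    CHECK_CUDA(cudaMalloc(&d_val, sizeof(int)));

    // cudaMemset fills bytes, so the int starts as 0x01010101, not 1.
    CHECK_CUDA(cudaMemset(d_val, 1, sizeof(int)));

    // One block, one thread, sizeof(int) bytes of dynamic shared memory.
    SharedDisplayDynamic<<<1, 1, sizeof(int)>>>(d_val);
    CHECK_CUDA(cudaGetLastError());       // catches launch configuration errors
    CHECK_CUDA(cudaDeviceSynchronize());  // catches errors raised during execution

    CHECK_CUDA(cudaMemcpy(&val, d_val, sizeof(int), cudaMemcpyDeviceToHost));
    CHECK_CUDA(cudaFree(d_val));
    return val;
}
```

Calling RunDynamicRepro() from host code should return the value written by thread 0. A breakpoint on the line that writes *d_val is where I would expect to be able to inspect s_dynamic.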
Here’s my setup:
Windows 7
Visual Studio 2010, SP1
Nsight Version 3.0, Build 3.0.0.13079 (release candidate 2) – I had the same behavior with RC1
CUDA Toolkit 5.0
Driver 314.07
GTX 460 (Compute capability 2.1)
Setup for headless debugging on the GTX 460
I appreciate any help.
David