Why the 64-bit atomicAdd error?

sedona · July 21, 2017, 6:43am

From the snippet below, can anyone please tell me why the second atomicAdd raises misalignment errors? I’m on a compute capability 6.1 GPU, so the 64-bit atomic add should be OK. (VS2015 SP3, Win7)

__global__ void why(
    double *dd,
    float *ff
)
{
    float thingFloat = 3.0f;
    atomicAdd(ff, thingFloat);    // works fine

    double thingDouble = 3.0f;
    atomicAdd(dd, thingDouble);   // raises misaligned atomic error
}

int main()
{
    double *dd;
    cudaMalloc((void **)&dd, 100 * sizeof(double));
    float *ff;
    cudaMalloc((void **)&ff, 100 * sizeof(float));

    why<<<1, 1>>>(dd, ff);
}

Robert_Crovella · July 21, 2017, 11:33am

Does it also raise this error if you run the compiled executable outside of Visual Studio, i.e. at the Windows command prompt, using cuda-memcheck?

BulatZiganshin · July 21, 2017, 12:08pm

Add printing of the variable addresses in both functions (main and the kernel) to check it yourself.
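As a rough sketch of that suggestion, something like the following prints the pointers on the host and again inside the kernel, and reports whether each is aligned to its element size. The helper name isAligned is made up for this illustration and is not part of the CUDA API; the cudaDeviceSynchronize() call is there only so the device-side printf output is flushed before the program exits.

```cuda
#include <cstdio>
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA API): true if p is a multiple of `bytes`.
__host__ __device__ inline bool isAligned(const void *p, size_t bytes)
{
    return reinterpret_cast<uintptr_t>(p) % bytes == 0;
}

__global__ void why(double *dd, float *ff)
{
    // Device-side view of the pointers that atomicAdd would receive.
    printf("kernel: dd=%p (8-byte aligned: %d), ff=%p (4-byte aligned: %d)\n",
           (void *)dd, (int)isAligned(dd, sizeof(double)),
           (void *)ff, (int)isAligned(ff, sizeof(float)));
}

int main()
{
    double *dd = nullptr;
    float *ff = nullptr;
    cudaMalloc((void **)&dd, 100 * sizeof(double));
    cudaMalloc((void **)&ff, 100 * sizeof(float));

    // Host-side view of the same pointers.
    printf("host:   dd=%p (8-byte aligned: %d), ff=%p (4-byte aligned: %d)\n",
           (void *)dd, (int)isAligned(dd, sizeof(double)),
           (void *)ff, (int)isAligned(ff, sizeof(float)));

    why<<<1, 1>>>(dd, ff);
    cudaDeviceSynchronize();   // wait for the kernel and flush device printf

    cudaFree(dd);
    cudaFree(ff);
    return 0;
}
```

If the host and kernel lines show different values for dd, or the double pointer is not a multiple of 8, that would point at how the argument is being passed rather than at atomicAdd itself.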

Robert_Crovella · July 22, 2017, 1:01am

Also, I would always recommend adding a cudaDeviceSynchronize() after the kernel call in a test code like that.

As written, your code will allow application tear-down to begin while the kernel is still executing. This can lead to unpredictable results.

I ran your code on Linux with CUDA 8.0.61 and can’t reproduce any issues. Other than the comment about cudaDeviceSynchronize(), I don’t believe there should be any issues with your code.
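For reference, a minimal sketch of the test with the synchronize added and every runtime call checked might look like this. The check helper is hypothetical (it is not a CUDA API), the cudaMemset calls are an addition so the atomics operate on initialized memory, and the sketch assumes the build targets compute capability 6.0 or higher (for example nvcc -arch=sm_61), since the double overload of atomicAdd is only available there.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void why(double *dd, float *ff)
{
    atomicAdd(ff, 3.0f);   // 32-bit float atomic add
    atomicAdd(dd, 3.0);    // 64-bit double atomic add (compute capability 6.0+)
}

// Hypothetical helper (not a CUDA API): report a failed runtime call.
static bool check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main()
{
    double *dd = nullptr;
    float *ff = nullptr;
    if (!check(cudaMalloc((void **)&dd, 100 * sizeof(double)), "cudaMalloc(dd)")) return 1;
    if (!check(cudaMalloc((void **)&ff, 100 * sizeof(float)), "cudaMalloc(ff)")) return 1;

    // Assumption for this sketch: start from zeroed buffers.
    if (!check(cudaMemset(dd, 0, 100 * sizeof(double)), "cudaMemset(dd)")) return 1;
    if (!check(cudaMemset(ff, 0, 100 * sizeof(float)), "cudaMemset(ff)")) return 1;

    why<<<1, 1>>>(dd, ff);

    // Launch errors show up here; execution errors show up at the synchronize.
    if (!check(cudaGetLastError(), "kernel launch")) return 1;
    if (!check(cudaDeviceSynchronize(), "kernel execution")) return 1;

    cudaFree(dd);
    cudaFree(ff);
    return 0;
}
```

With the synchronize in place, the kernel finishes before application tear-down starts, and any error it raises is reported by the check on cudaDeviceSynchronize().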

pasoleatis · July 25, 2017, 9:10pm

What flags do you use for compiling?