My CUDA program behaves strangely.
I detected a memory problem with cuda-memcheck, so I ran the program under cuda-gdb with CUDA memcheck enabled to understand the problem better, and got the following error:
Program received signal CUDA_EXCEPTION_10, Device Illegal Address.
What I don’t understand is that the instruction that generates this error is a simple addition of a local scalar variable and a shared scalar variable (no arrays, no pointers).
If I inspect the kernel's stack, cuda-gdb shows the following warning for one of these variables:
warning: Variable is not live at this point. Returning garbage value.
I initialize the local variables at declaration and the shared variables at the beginning of the kernel with this code:
if (tid == 0) {
    // initialization
}
__syncthreads();
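To make the pattern concrete, here is a minimal, self-contained sketch of it. It is an illustrative reduction and not the actual kernel: the names `addShared`, `runAddShared`, `s_offset` and `out`, the value 100, and the launch configuration of 2 blocks of 32 threads are all assumptions made for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: one shared scalar, initialized by thread 0 of each block.
__global__ void addShared(int *out, int n)
{
    __shared__ int s_offset;          // shared scalar (one copy per block)
    int tid = threadIdx.x;
    int local = tid;                  // local scalar, initialized at declaration

    if (tid == 0) {
        s_offset = 100;               // initialization done by a single thread
    }
    __syncthreads();                  // must be reached by every thread of the block

    int gid = blockIdx.x * blockDim.x + tid;
    if (gid < n) {                    // bounds check placed after the barrier
        out[gid] = local + s_offset;  // addition of a local and a shared scalar
    }
}

// Hypothetical host helper: launches the kernel and copies the result back.
// Assumes n == 64 so that 2 blocks of 32 threads cover the whole array.
static cudaError_t runAddShared(int *h_out, int n)
{
    int *d_out = NULL;
    cudaError_t err = cudaMalloc((void **)&d_out, n * sizeof(int));
    if (err != cudaSuccess) return err;

    addShared<<<2, 32>>>(d_out, n);
    err = cudaDeviceSynchronize();    // reports errors raised while the kernel ran
    if (err == cudaSuccess) {
        err = cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_out);
    return err;
}

int main()
{
    int h_out[64];
    cudaError_t err = runAddShared(h_out, 64);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("out[0] = %d, out[33] = %d\n", h_out[0], h_out[33]);
    return 0;
}
```

In this sketch no thread returns or branches around `__syncthreads()`, so all threads of a block reach the barrier before any of them reads `s_offset`.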
Can anyone explain the reason for this behaviour, or give me some suggestions?
I use CUDA 4.0 on a Tesla M2070. I’m not the administrator of the system, so I cannot update to version 4.1 or 4.2, but I can also use CUDA 3.2 or 3.1 if needed. The operating system is Red Hat Enterprise Linux 5.5, 64-bit.