I am writing a new CUDA program, and I am seeing a problem I have not
seen in any of the previous programs I have written.
The symptom is that threadIdx.x, threadIdx.y, blockIdx.x, blockIdx.y, etc. are always zero. The other
built-in variables seem to be fine, e.g. blockDim, gridDim, warpSize.
Other programs that I have written seem to compile and run fine. My guess is that I am
somehow clobbering something in the kernel, or not initializing something, but it's not
obvious. Has anyone else seen this? Does anyone have any suggestions?
I am running CUDA 3.0 on Red Hat Enterprise Linux 5.4 with a GeForce GTX 260 (216 cores).
This problem appears to be related to using threadIdx, etc. to initialize a
variable in its declaration. If you do a normal assignment to the variable instead,
everything seems to work. This is CUDA 3.0.
int x = blockIdx.x; // does not work, but
int x; x = blockIdx.x; // works.
Could you post a small code example that has the same problem as your application? Device as well as host code would be useful here, as it does sound like the kernel itself is being optimized out.
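For reference, a minimal test case along these lines might look like the sketch below. It is only an assumption about what a useful repro could contain, not your actual code: the kernel name recordIndices, the buffer names, and the launch sizes are all made up for illustration. The kernel uses both the initialization form and the assignment form, and writes every value to global memory so that the compiler cannot eliminate the kernel body as dead code. The host side checks the launch for errors before reading the results back.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: stores the block index obtained two ways, plus the
// thread index, into global memory so none of the code is dead.
__global__ void recordIndices(int *initOut, int *assignOut, int *tidOut)
{
    int a = blockIdx.x;   // initialization form (reported as failing)
    int b;
    b = blockIdx.x;       // assignment form (reported as working)

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    initOut[i]   = a;
    assignOut[i] = b;
    tidOut[i]    = threadIdx.x;
}

int main()
{
    const int blocks = 4, threads = 8, n = blocks * threads;
    const size_t bytes = n * sizeof(int);
    int hInit[n], hAssign[n], hTid[n];
    int *dInit = 0, *dAssign = 0, *dTid = 0;

    cudaMalloc((void **)&dInit, bytes);
    cudaMalloc((void **)&dAssign, bytes);
    cudaMalloc((void **)&dTid, bytes);

    recordIndices<<<blocks, threads>>>(dInit, dAssign, dTid);

    // A failed launch leaves the output buffers untouched, which can look
    // like "all indices are zero", so check for a launch error first.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(hInit, dInit, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(hAssign, dAssign, bytes, cudaMemcpyDeviceToHost);
    err = cudaMemcpy(hTid, dTid, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("kernel or copy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < n; ++i)
        printf("i=%2d  init=%d  assign=%d  threadIdx.x=%d\n",
               i, hInit[i], hAssign[i], hTid[i]);

    cudaFree(dInit);
    cudaFree(dAssign);
    cudaFree(dTid);
    return 0;
}
```

If the init and assign columns differ in a program like this, that would point at the compiler. If they match, the problem is more likely somewhere else in your kernel or in how its results are read back.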