Hi all,
I’m having issues with atomicMax on CUDA 7.
They show up when I compile my code with -O0 -G and set a --maxrregcount limit.
My setup is:
[dcampora@gpu01 trackfollowing]$ uname -a
Linux gpu01 2.6.32-504.16.2.el6.x86_64 #1 SMP Tue Apr 21 21:44:51 CEST 2015 x86_64 x86_64 x86_64 GNU/Linux
[dcampora@gpu01 trackfollowing]$ nvidia-smi
Thu May 28 06:05:24 2015
+------------------------------------------------------+
| NVIDIA-SMI 346.46     Driver Version: 346.46         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 980     Off  | 0000:03:00.0     N/A |                  N/A |
|  0%   33C    P0    N/A /  N/A |     15MiB /  4095MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
Here’s a cuda-gdb run of my code:
(cuda-gdb) n
505 atomicMax(max_numhits_to_process, num_h1_to_process);
(cuda-gdb) n
508 __syncthreads();
(cuda-gdb) n
warning: Warp(s) other than the current warp had to be single-stepped.
511 for (int j=0; j<((int) ceilf(((float) (max_numhits_to_process[0])) / blockDim.y)); ++j) {
(cuda-gdb) p max_numhits_to_process
$6 = (@generic unsigned int * @register) 0x704f00018
(cuda-gdb) p max_numhits_to_process[0]
$7 = 0
(cuda-gdb) p num_h1_to_process
$8 = 19
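The pattern around line 505 can be reduced to roughly the following sketch, which may help to reproduce the behaviour in isolation. The kernel name `max_then_loop`, the `iterations` output, the launch configuration, and the per-thread values are all assumptions made for illustration and are not taken from the actual code; it also assumes `max_numhits_to_process` points to global memory, which the session above does not confirm.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel: every thread proposes a count,
// and the block-wide maximum is then used to compute a loop bound.
__global__ void max_then_loop(unsigned int* max_numhits_to_process,
                              unsigned int* iterations) {
    // Made-up per-thread value (1..blockDim.x); the real code computes it earlier.
    unsigned int num_h1_to_process = threadIdx.x + 1;

    atomicMax(max_numhits_to_process, num_h1_to_process);
    __syncthreads();

    // Same loop-bound expression as on line 511 of the session above.
    int bound = (int) ceilf(((float) (max_numhits_to_process[0])) / blockDim.y);
    if (threadIdx.x == 0 && threadIdx.y == 0) {
        iterations[0] = (unsigned int) bound;
    }
}

int main() {
    unsigned int* d_max = nullptr;
    unsigned int* d_iter = nullptr;
    unsigned int h_max = 0, h_iter = 0;

    cudaMalloc(&d_max, sizeof(unsigned int));
    cudaMalloc(&d_iter, sizeof(unsigned int));
    cudaMemset(d_max, 0, sizeof(unsigned int));
    cudaMemset(d_iter, 0, sizeof(unsigned int));

    // One block of 32 x 4 threads (an arbitrary choice for this sketch).
    max_then_loop<<<1, dim3(32, 4)>>>(d_max, d_iter);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(&h_max, d_max, sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaMemcpy(&h_iter, d_iter, sizeof(unsigned int), cudaMemcpyDeviceToHost);
    printf("max = %u, iterations = %u\n", h_max, h_iter);

    cudaFree(d_max);
    cudaFree(d_iter);
    return 0;
}
```

If atomicMax behaves as documented, this should report a maximum of 32 and 8 iterations for the 32 x 4 launch. Building it with the same flags as the failing code (-O0 -G plus the --maxrregcount setting) would show whether the problem survives outside the full kernel.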
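After the atomicMax call and the __syncthreads(), max_numhits_to_process[0] still reads 0, even though num_h1_to_process is 19 in the current thread.
Any clues?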