Confusion about 64-bit shared memory access

King_Crimson · May 6, 2012, 6:21pm

The CUDA Programming Guide v4.2, section F.4.3.2, says:

64-Bit Accesses

For 64-bit accesses, a bank conflict only occurs if two threads in either of the half-warps access different addresses belonging to the same bank.

Unlike for devices of compute capability 1.x, there are no bank conflicts for arrays of doubles accessed as follows, for example:
extern __shared__ float shared[]; 

double data = shared[BaseIndex + tid];

Question 1: Is “float” a typo? Shouldn’t it be “double”?

Question 2: Does this imply that a 64-bit shared memory request is processed per half-warp rather than for the entire warp? Otherwise, accesses to, say, shared[0] and shared[16] by threads 0 and 16 would incur a bank conflict, right?

Thanks for the clarification!

cudaDMA · May 10, 2012, 9:09pm

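Bank conflicts for 64-bit accesses are evaluated at the half-warp level, so threads 0 and 16 belong to different half-warps and do not conflict with each other.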
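
To make the half-warp point concrete, here is a rough host-side sketch (plain C++, no GPU needed) that counts how many distinct 32-bit words land in the same bank for a group of threads reading doubles. It assumes the compute capability 2.x layout of 32 banks, each 32 bits wide, with successive words in successive banks. The names `conflict_degree`, `kBanks`, `base` and `stride` are made up for illustration, and this is only a simplified counting model, not a description of how the hardware actually arbitrates requests.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Assumed model of compute capability 2.x shared memory:
// 32 banks, each 32 bits wide, successive 32-bit words in successive banks.
constexpr int kBanks = 32;

// Thread t reads the 64-bit element at index base + t * stride
// (base and stride are assumed to be non-negative).
// Returns the largest number of distinct 32-bit words that fall into a single
// bank for threads first_thread .. first_thread + group_size - 1.
// A result of 1 means the group is conflict-free in this model.
int conflict_degree(int first_thread, int group_size, int base, int stride) {
    std::vector<std::set<long>> words_per_bank(kBanks);
    for (int t = first_thread; t < first_thread + group_size; ++t) {
        long element = base + static_cast<long>(t) * stride;
        for (int half = 0; half < 2; ++half) {
            long word = 2 * element + half;  // a double spans two 32-bit words
            words_per_bank[static_cast<std::size_t>(word % kBanks)].insert(word);
        }
    }
    std::size_t worst = 0;
    for (const auto& words : words_per_bank) {
        worst = std::max(worst, words.size());
    }
    return static_cast<int>(worst);
}
```

In this model, `conflict_degree(0, 16, 0, 1)` and `conflict_degree(16, 16, 0, 1)` both give 1: each half-warp reading consecutive doubles touches 32 consecutive words, one per bank. Treating the whole warp as one group, `conflict_degree(0, 32, 0, 1)` gives 2, which is the shared[0] versus shared[16] collision from question 2. A stride of two doubles within a half-warp, `conflict_degree(0, 16, 0, 2)`, also gives 2.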