Confusion about 64-bit shared memory access

In the CUDA Programming Guide v4.2, section F.4.3.2, it says:

Question 1: Is “float” a typo? Shouldn’t it be “double”?

Question 2: Does it imply that a 64-bit memory access request is handled per half-warp rather than for the entire warp? Otherwise, accesses to, say, shared[0] and shared[16] by threads 0 and 16 would incur a bank conflict, right?

Thanks for the clarification!

In this case, bank conflicts are evaluated at the half-warp level.
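
To make the arithmetic behind question 2 concrete, here is a rough host-side sketch in Python (not device code). It assumes 32 banks of 4 bytes each, as on compute capability 2.x, and a simplified conflict rule: two threads conflict when they touch different 8-byte elements that map to the same bank. The names `banks_for_double` and `max_conflict_degree` are made up for this illustration and do not come from the guide.

```python
from collections import defaultdict

NUM_BANKS = 32   # assumption: compute capability 2.x
BANK_WIDTH = 4   # assumption: each bank is one 32-bit word wide
HALF_WARP = 16

def banks_for_double(index):
    """Banks touched by the 8-byte element shared[index] of a double array."""
    first_word = index * 8 // BANK_WIDTH
    return {first_word % NUM_BANKS, (first_word + 1) % NUM_BANKS}

def max_conflict_degree(thread_to_index, threads):
    """Largest number of distinct elements mapping to one bank among `threads`."""
    per_bank = defaultdict(set)
    for t in threads:
        idx = thread_to_index(t)
        for bank in banks_for_double(idx):
            per_bank[bank].add(idx)
    return max(len(elems) for elems in per_bank.values())

# Access pattern from the question: thread t reads shared[t].
identity = lambda t: t

whole_warp = max_conflict_degree(identity, range(32))
first_half = max_conflict_degree(identity, range(0, HALF_WARP))
second_half = max_conflict_degree(identity, range(HALF_WARP, 32))

print(banks_for_double(0), banks_for_double(16))
print(whole_warp, first_half, second_half)
```

Under these assumptions, shared[0] and shared[16] both land in banks 0 and 1, so a whole-warp check gives a degree of 2, while each half-warp on its own gives a degree of 1 (no conflict). That matches the half-warp reading above.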