How can I find out the maximum tile size per block that my graphics card supports?
(That is, how much data can I copy from device memory into shared memory at once?)
As far as I understand, parallel computation in CUDA happens only within a block (thread-level parallelism),
while the blocks themselves are still processed sequentially: a block starts running on the GPU
only after the previous block has finished its work.
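One way to check these limits is to ask the CUDA runtime directly. The sketch below is only an illustration, assuming device 0, `float` tile elements, and one thread per tile element; `maxSquareTileDim` is a hypothetical helper written for this post, not part of CUDA. It reads `sharedMemPerBlock` and `maxThreadsPerBlock` from `cudaGetDeviceProperties` and derives the largest square tile that fits both limits. It also prints `multiProcessorCount`, since blocks are distributed across the multiprocessors and several of them can run at the same time.

```cuda
#include <cstdio>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA API): largest edge length T such that a
// T x T tile of elemSize-byte elements fits in sharedBytes of shared memory
// and T * T threads (one per element) fit in a single block.
static int maxSquareTileDim(size_t sharedBytes, size_t elemSize, int maxThreads) {
    int t = 0;
    while ((size_t)(t + 1) * (size_t)(t + 1) * elemSize <= sharedBytes &&
           (t + 1) * (t + 1) <= maxThreads) {
        ++t;
    }
    return t;
}

int main() {
    int device = 0;  // assumption: query the first GPU in the system
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    std::printf("Device:                  %s\n", prop.name);
    std::printf("Shared memory per block: %zu bytes\n", prop.sharedMemPerBlock);
    std::printf("Max threads per block:   %d\n", prop.maxThreadsPerBlock);
    std::printf("Multiprocessors:         %d\n", prop.multiProcessorCount);

    // Assumption: float elements, one thread per tile element.
    int tile = maxSquareTileDim(prop.sharedMemPerBlock, sizeof(float),
                                prop.maxThreadsPerBlock);
    std::printf("Largest square float tile: %d x %d\n", tile, tile);
    return 0;
}
```

If a kernel keeps more than one tile in shared memory (for example, one tile from each input matrix), pass the combined size of those elements as `elemSize`, or divide `sharedBytes` accordingly. The `deviceQuery` sample shipped with the CUDA SDK prints the same properties.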
Department of Mathematics, Tsing Hua University, R.O.C.
Lung Sheng Chien