Hi, all!
I have a question about concurrent kernel execution (section 3.2.5.3 of the NVIDIA CUDA C Programming Guide, Version 4.1).
Quote: “Some devices of compute capability 2.x can execute multiple kernels concurrently.”
What exactly does “concurrently” mean here? I can think of two interpretations.
a) In parallel. If one kernel cannot use all the compute resources on the GPU, other kernels can be scheduled onto the remaining resources at the same time.
b) Interleaved. The kernels are scheduled like multitasking on a single GPU: they appear to run simultaneously, but their executions do not actually overlap on the timeline.
Which one is correct? Or is it neither?
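To make the question concrete, here is a rough sketch of how I would try to tell (a) from (b) experimentally. It is only a sketch: the kernel name `spin`, the helper `timeTwoLaunches`, and the cycle count are my own placeholders, and error checking is omitted. The idea is to launch two tiny single-block kernels, once into the same stream and once into two different streams, and compare the elapsed times.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel: busy-waits for roughly `cycles` GPU clock ticks
// so that each launch has a measurable duration.
__global__ void spin(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

// Hypothetical helper: launches `spin` once into s0 and once into s1
// (one block, one thread each) and returns the elapsed time in ms.
static float timeTwoLaunches(cudaStream_t s0, cudaStream_t s1, long long cycles) {
    cudaEvent_t begin, end;
    cudaEventCreate(&begin);
    cudaEventCreate(&end);

    cudaDeviceSynchronize();
    cudaEventRecord(begin, 0);        // default stream: precedes the launches below
    spin<<<1, 1, 0, s0>>>(cycles);
    spin<<<1, 1, 0, s1>>>(cycles);
    cudaEventRecord(end, 0);          // default stream: waits for both launches
    cudaEventSynchronize(end);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, begin, end);
    cudaEventDestroy(begin);
    cudaEventDestroy(end);
    return ms;
}

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("concurrentKernels = %d\n", prop.concurrentKernels);

    cudaStream_t a, b;
    cudaStreamCreate(&a);
    cudaStreamCreate(&b);

    const long long cycles = 100000000LL;  // arbitrary; just long enough to time

    float sameStream = timeTwoLaunches(a, a, cycles);  // forced to run back to back
    float twoStreams = timeTwoLaunches(a, b, cycles);  // allowed to overlap, if supported

    printf("same stream: %.2f ms\n", sameStream);
    printf("two streams: %.2f ms\n", twoStreams);

    cudaStreamDestroy(a);
    cudaStreamDestroy(b);
    return 0;
}
```

My assumption is that if (a) is right, the two-stream time should be close to the time of a single kernel, and if (b) is right, it should be close to the same-stream time, since each kernel occupies only one block and leaves most of the GPU idle.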
Thank you in advance!