How to select the block ID on the host side and then execute on the device

I am using CUDA 7 on CentOS 6.7 for image processing.
I implemented it as follows,
and I think an error is occurring in the CUDA kernel.
The process does not stop, but
the CUDA kernel appears not to run.

--------loop 1~3--------------------------------------------

  1. In the main process, create 3 threads.

  2. Each CPU thread processes 1 image (that is, 3 images are processed at the same time).
    2-1. Execute cudaMalloc
    2-2. Execute cudaMemcpy (cudaMemcpyHostToDevice)
    2-3. CUDA dim3 blocks(1,1,1)
    2-4. CUDA dim3 threads(8,8,1)
    2-5. Execute the CUDA kernel.
    2-6. Wait for the CUDA kernel to finish (cudaThreadSynchronize)
    2-7. Execute cudaMemcpy (cudaMemcpyDeviceToHost)

  3. Wait for the 3 CPU threads to finish.


I tried reducing the 3 CPU threads to 1 CPU thread (in this version, the kernel did not stop).

I think the problem is caused by each thread executing “dim3 blocks” and “dim3 threads”.

To resolve the above problem,
I am thinking of executing “dim3 blocks” and “dim3 threads” before starting the CPU threads.

Also, I want to know how to use CUDA blocks as follows.
CPU thread 1 ⇒ use block 1 on the GPU

CPU thread 2 ⇒ use block 2 on the GPU

CPU thread 3 ⇒ use block 3 on the GPU

In this way, I want to select the block ID on the host side and then execute on the device.

Please tell me how to do that with CUDA.


__global__ void
kernelFunction(int* inA, int* inB, int* inC)
{
    int x = threadIdx.x;
    int y = threadIdx.y;
    int z = threadIdx.z;
    :
    :
}

void A_CPU_Thread()
{
    dim3 grid(1,1,1);
    dim3 block(3,1,1);
    dim3 thread(8,8,1);

    kernelFunction<<<4,512>>>(A,B); // ← I want to select the block ID on the host side
                                    //      A_CPU_Thread 1 ⇒ block ID 1
                                    //      A_CPU_Thread 2 ⇒ block ID 2
                                    //      A_CPU_Thread 3 ⇒ block ID 3
    :
    :
    :

}

So, “A_CPU_Thread” runs 3 times concurrently from the main method.
I want to select the block ID on the host side:
A_CPU_Thread 1 ⇒ block ID 1
A_CPU_Thread 2 ⇒ block ID 2
A_CPU_Thread 3 ⇒ block ID 3
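
For reference, blockIdx is assigned by the GPU from the grid dimensions of the launch; the host can choose how many blocks are launched, but not which ID a block receives. One possible way to get the mapping described above is to pass the index chosen by the host as an ordinary kernel argument and give each CPU thread its own stream. The following is only a sketch under those assumptions: markSlot, cpuThreadWork, runThreeSlots, slot and slotSize are hypothetical names that do not appear in the original code, the real image processing is replaced by a placeholder write, and error checking is left out for brevity. It assumes host code compiled as C++11 (for std::thread).

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical kernel: 'slot' is chosen by the host and plays the role the
// post wanted blockIdx to play. Each thread of the 8x8 block writes one
// element of the region that belongs to that slot.
__global__ void markSlot(int* data, int slot, int slotSize)
{
    int i = threadIdx.y * blockDim.x + threadIdx.x;
    if (i < slotSize) {
        data[slot * slotSize + i] = slot + 1;   // placeholder for real work
    }
}

// Work done by one CPU thread: launch on its own stream with its own slot,
// then wait only for that stream.
void cpuThreadWork(int* devData, int slot, int slotSize)
{
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    markSlot<<<dim3(1, 1, 1), dim3(8, 8, 1), 0, stream>>>(devData, slot, slotSize);
    cudaStreamSynchronize(stream);
    cudaStreamDestroy(stream);
}

// Starts three CPU threads, one slot each, and returns the sum of all
// elements so the result can be checked on the host.
int runThreeSlots()
{
    const int slots = 3;
    const int slotSize = 64;                    // matches dim3 threads(8,8,1)
    const int n = slots * slotSize;

    int* dev = NULL;
    cudaMalloc((void**)&dev, n * sizeof(int));
    cudaMemset(dev, 0, n * sizeof(int));

    std::vector<std::thread> workers;
    for (int s = 0; s < slots; ++s) {
        workers.push_back(std::thread(cpuThreadWork, dev, s, slotSize));
    }
    for (std::size_t k = 0; k < workers.size(); ++k) {
        workers[k].join();
    }

    std::vector<int> host(n);
    cudaMemcpy(host.data(), dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    int sum = 0;
    for (int i = 0; i < n; ++i) sum += host[i];
    return sum;
}
```

If the three images can be handed to the GPU together, a single launch with a grid of 3 blocks, where the kernel reads blockIdx.x to pick its image, might be a simpler alternative to three separate launches.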


a) you probably need proper error checking

“I think that” (as noted by you) is normally the 1st step
“i then tested” normally being the 2nd
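
As an illustration of what "proper error checking" could look like, here is a minimal sketch. CUDA_CHECK, fillIndex and runChecked are hypothetical names introduced only for this example; the kernel is a placeholder, not the kernel from the question. The sketch uses cudaDeviceSynchronize rather than cudaThreadSynchronize, since the latter is deprecated.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: print a message and abort if a CUDA runtime
// call returns anything other than cudaSuccess.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s at %s:%d\n",             \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Placeholder kernel: each thread of the 8x8 block writes its flat index.
__global__ void fillIndex(int* out)
{
    int i = threadIdx.y * blockDim.x + threadIdx.x;
    out[i] = i;
}

// Runs the steps from the question with a check after each one and
// returns the sum of the values copied back to the host.
int runChecked()
{
    const int n = 64;                      // 8 * 8 threads
    int host[n];
    int* dev = NULL;

    CUDA_CHECK(cudaMalloc((void**)&dev, n * sizeof(int)));

    dim3 blocks(1, 1, 1);
    dim3 threads(8, 8, 1);
    fillIndex<<<blocks, threads>>>(dev);
    CUDA_CHECK(cudaGetLastError());        // reports an invalid launch configuration
    CUDA_CHECK(cudaDeviceSynchronize());   // reports errors raised while the kernel ran

    CUDA_CHECK(cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dev));

    int sum = 0;
    for (int i = 0; i < n; ++i) sum += host[i];
    return sum;
}
```

A kernel launch does not return an error code itself, so the two checks after the launch are what would reveal a kernel that silently failed to start or faulted while running.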

b) quickest might be to step your program, including the kernel, in the debugger

cross posting:

http://stackoverflow.com/questions/33886285/how-to-select-the-blockid-at-the-host-side