cudaMemcpy - consuming a lot of execution time
I am using cudaMemcpy to copy an image to the GPU for a CUDA kernel to process, and according to nvprof it accounts for around 78% of the API call time. Is there a way to reduce this overhead, or is there another API that would make my application run faster?

#1
Posted 04/16/2018 10:18 AM   