What is an OpenCL Kernel ID?

Absolute OpenCL/CUDA beginner here. I’m currently trying to speed up my OpenCV 3.2 code by using its OpenCL implementation through the Transparent API (T-API). While profiling with Nsight, I noticed that in the “OpenCL Kernel Launch Commands” view, every call of an OCL kernel shows a distinct Kernel ID, e.g. the 1st, 2nd and 3rd calls of “morph” show Kernel IDs 1, 3 and 5, respectively. This is in contrast to other OpenCL code: when I run the ToneMapping sample from the Intel OpenCL samples, the Kernel ID of “ToneMappingPerPixel” is always 1, no matter how often I call it. I have been unable to find out what a “Kernel ID” is in the first place (neither in the OpenCL specification, nor in the Nsight documentation, nor in the developer forum here) or when it is issued. I would expect a kernel to get an ID at creation/compilation time, and I’m wondering whether this hints at an accidental re-compilation of kernels in each loop iteration.

Hello iko79,

Could you give me the sample you use? Then I can try to reproduce it on my computer.

Best Regards
Harry

Hi,

here’s the code:

cv::UMat m1;
cv::UMat m2;
cv::Mat kernel = cv::getStructuringElement( cv::MORPH_ELLIPSE, cv::Size( 5, 5 ), cv::Point( 3, 3 ) );

cv::imread( "frame.tiff" ).copyTo( m1 );

for( int i = 0; i < 10; i++ )
{
	cv::morphologyEx( m1, m1, cv::MORPH_DILATE, kernel );
	cv::cvtColor( m1, m2, cv::COLOR_RGB2GRAY );
}

The Intel sample I was referring to can be downloaded here

Best,
iko

Hi,

I just found out what’s going on. I’m also not familiar with OpenCL, but I hope my answer helps.

The Kernel ID appears to identify each OpenCL kernel object you create, e.g. the ToneMappingPerPixel kernel in your sample.

// This creates kernel 1
OpenCLProgramOneKernel exec(ocl, L"ToneMapping.cl", "", "ToneMappingPerPixel", "-cl-fast-relaxed-math -cl-denorms-are-zero");

When you construct OpenCLProgramOneKernel twice, two kernels are created.

// This creates kernel 1
OpenCLProgramOneKernel exec(ocl, L"ToneMapping.cl", "", "ToneMappingPerPixel", "-cl-fast-relaxed-math -cl-denorms-are-zero");
// This creates kernel 2
OpenCLProgramOneKernel exec1(ocl, L"ToneMapping.cl", "", "ToneMappingPerPixel", "-cl-fast-relaxed-math -cl-denorms-are-zero");

But how about this?

// This creates kernel 1
OpenCLProgramOneKernel* exec = new OpenCLProgramOneKernel(ocl, L"ToneMapping.cl", "", "ToneMappingPerPixel", "-cl-fast-relaxed-math -cl-denorms-are-zero");
delete exec;
// exec1 should still get Kernel ID 2, even though kernel 1 has already been destroyed
OpenCLProgramOneKernel exec1(ocl, L"ToneMapping.cl", "", "ToneMappingPerPixel", "-cl-fast-relaxed-math -cl-denorms-are-zero");
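To make that guess concrete, here is a minimal model in plain C++ with no OpenCL involved. It assumes the profiler simply hands out the next number each time a kernel object is created and never reuses a number after a kernel is destroyed. That is only my assumption; I have not found it documented for Nsight. KernelIdCounter, FakeKernel and create_delete_create are hypothetical names made up for this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the profiler's bookkeeping (an assumption, not
// documented Nsight behaviour): every created kernel object gets the next
// number, and numbers are never handed out twice.
struct KernelIdCounter {
    int next = 1;
    int assign() { return next++; }
};

// Hypothetical stand-in for a kernel object (a cl_kernel in real code).
struct FakeKernel {
    int id;
    explicit FakeKernel(KernelIdCounter& counter) : id(counter.assign()) {}
};

// Create a kernel, destroy it, create another one.
// Returns the IDs in the order they were assigned.
std::vector<int> create_delete_create() {
    KernelIdCounter counter;
    std::vector<int> ids;

    FakeKernel* first = new FakeKernel(counter);
    ids.push_back(first->id);
    delete first;  // destroying the kernel does not free its ID in this model

    FakeKernel second(counter);
    ids.push_back(second.id);

    assert(second.id != ids.front());  // IDs are never reused here
    return ids;
}
```

Under this model the second kernel gets a fresh ID even though the first one no longer exists, which is what the snippet above predicts.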

Let’s go back to your OpenCV code.

for( int i = 0; i < 10; i++ )
{
	// My guess: each call to morphologyEx creates two OpenCL kernels, e.g. 1 and 2
	cv::morphologyEx( m1, m1, cv::MORPH_DILATE, kernel );
	// After execution these two kernels are destroyed; in the next iteration two new kernels are created as 3 and 4
	cv::cvtColor( m1, m2, cv::COLOR_RGB2GRAY );
}

I didn’t check the OpenCV source, but I guess it’s a bad idea to call morphologyEx in a loop: the same kernels will be created and destroyed again and again.
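As a rough illustration of the difference, the sketch below contrasts creating kernel objects inside the loop with creating one kernel up front and launching it repeatedly. It is plain C++ and does not call OpenCL or OpenCV; the ID scheme (one fresh number per created kernel object) is my assumption about what the profiler does, and all names in it are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Assumed ID scheme: each created kernel object gets the next number.
struct KernelIdCounter {
    int next = 1;
    int assign() { return next++; }
};

// Hypothetical stand-in for a kernel object.
struct FakeKernel {
    int id;
    explicit FakeKernel(KernelIdCounter& counter) : id(counter.assign()) {}
};

// Pattern A: every iteration creates `kernels_per_iteration` new kernel
// objects. Returns the ID of the first kernel created in each iteration.
std::vector<int> ids_when_recreated(int iterations, int kernels_per_iteration) {
    assert(iterations >= 0 && kernels_per_iteration >= 1);
    KernelIdCounter counter;
    std::vector<int> ids;
    for (int i = 0; i < iterations; ++i) {
        FakeKernel first(counter);
        ids.push_back(first.id);
        // The remaining kernels of this iteration also consume IDs.
        for (int k = 1; k < kernels_per_iteration; ++k) {
            FakeKernel other(counter);
            (void)other;
        }
    }
    return ids;
}

// Pattern B: the kernel is created once and launched `iterations` times.
std::vector<int> ids_when_reused(int iterations) {
    assert(iterations >= 0);
    KernelIdCounter counter;
    FakeKernel kernel(counter);
    std::vector<int> ids;
    for (int i = 0; i < iterations; ++i) {
        ids.push_back(kernel.id);  // same object, same ID on every launch
    }
    return ids;
}
```

With two kernel objects created per iteration, ids_when_recreated(3, 2) gives 1, 3, 5, the same sequence reported for “morph” in the first post, while ids_when_reused(3) gives 1, 1, 1 like the ToneMapping sample. This only shows that the guess is consistent with the observations; it says nothing about whether OpenCV also recompiles the program each time.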

I just added a loop to the ToneMapping sample and got a similar result; please check the attached picture.

Thanks for your investigation and for verifying my observations. I’ve put this on the OpenCV Q&A forum; let’s see what they have to say. I’ll try to come back to this topic when I find enough time to dig into the OpenCV code.