Parallelizing the SLIC superpixels algorithm on OpenCV 2.4

This is the same question I asked on StackOverflow: http://stackoverflow.com/questions/35191932/parallelizing-slic-superpixels-algorithm-on-opencv-2-4

background

OpenCV 3.0 has a contrib module that introduces SLIC superpixels. The previous version only had a SEEDS superpixel implementation.

I cannot move to OpenCV 3.0 because I am working on an Nvidia Jetson TX1, which comes with a closed-source, CPU-optimized version of OpenCV 2.4.10.

I want to take advantage of the highly parallel architecture of the Jetson. I have a simple SLIC superpixel implementation working with OpenCV 2.4.

problem

My question is: how can I use the OpenCV GPU class to parallelize a set of nested for loops, which is what I have in my case?

Basically, I have something like this:

for(;;) // reading video frame by frame
{
  Mat im;
  cap >> im;
  ...
  ...
  for(int i=0; i<5; i++) //Localized K-means. 5 iterations is enough for my purpose.
  {
    for(cluster_no=0; cluster_no<total_clusters; cluster_no++)
    {
       ...
       // Determine length and width of the current cluster.
       ...
       for(each pixel in the current cluster span)
       {
         // Compute distance between current cluster center and current pixel
         ...
       }
    }
  } 
  // Compute new cluster centres based on the pixel distribution.
  // Update the clusters.
}
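
One way to fill in the "Compute new cluster centres" step is to average the colour and position of the pixels assigned to each cluster. The following is only a sketch under assumptions that the question does not state: the frame has already been converted to a flat, interleaved Lab float buffer, labels holds one cluster index per pixel, and Center and update_centers are hypothetical names, not OpenCV API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cluster record: mean Lab colour plus mean pixel position.
struct Center { float l, a, b, x, y; };

// Recompute every center as the mean of the pixels currently assigned to it.
// lab:    interleaved L, a, b values, row-major, 3 floats per pixel (assumed layout)
// labels: one cluster index per pixel, or -1 for "unassigned"
void update_centers(const std::vector<float>& lab,
                    const std::vector<int>& labels,
                    int width, int height,
                    std::vector<Center>& centers)
{
    assert(labels.size() == static_cast<std::size_t>(width) * height);
    assert(lab.size() == 3 * labels.size());

    std::vector<Center> sum(centers.size(), Center{0.0f, 0.0f, 0.0f, 0.0f, 0.0f});
    std::vector<int> count(centers.size(), 0);

    // Accumulate colour and position sums per cluster.
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            const int k = labels[i];
            if (k < 0) continue;  // pixel not assigned to any cluster
            assert(static_cast<std::size_t>(k) < centers.size());
            sum[k].l += lab[3 * i + 0];
            sum[k].a += lab[3 * i + 1];
            sum[k].b += lab[3 * i + 2];
            sum[k].x += static_cast<float>(x);
            sum[k].y += static_cast<float>(y);
            ++count[k];
        }
    }

    // Divide the sums by the pixel counts to get the new centers.
    for (std::size_t k = 0; k < centers.size(); ++k) {
        if (count[k] == 0) continue;  // an empty cluster keeps its old center
        const float inv = 1.0f / static_cast<float>(count[k]);
        centers[k] = Center{sum[k].l * inv, sum[k].a * inv, sum[k].b * inv,
                            sum[k].x * inv, sum[k].y * inv};
    }
}
```

Because this step is a plain sum followed by a division, it could probably be parallelized by letting each thread accumulate its own partial sums and merging them before the division.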

Looking at the OpenCV 3.0 implementation https://github.com/Itseez/opencv_contrib/blob/master/modules/ximgproc/src/slic.cpp#L1217 (line 1217), they too have optimised this using parallel_for_, which can be backed by TBB. Is there a way to do this using the OpenCV 2.4 GPU class? Or, if there’s not, does that mean I have to use CUDA explicitly?
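
To make the parallel structure concrete, here is a minimal CPU-side sketch of the assignment step turned inside out. Instead of looping over clusters, where two neighbouring clusters may write to the same pixel, each pixel picks its nearest center among those within S pixels, so bands of rows can be processed independently. std::thread stands in for whatever backend is available. Image, Center, assign_rows and assign_parallel are hypothetical names, and S (grid interval) and m (compactness weight) come from the standard SLIC distance, which the question does not spell out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <thread>
#include <vector>

// Hypothetical types for illustration; not part of OpenCV.
struct Center { float l, a, b, x, y; };
struct Image {
    int width, height;
    std::vector<float> lab;  // interleaved L, a, b per pixel, row-major
};

// Label every pixel in rows [row_begin, row_end) with its nearest center.
// Squared SLIC distance: d_lab^2 + (m / S)^2 * d_xy^2.
// For brevity this scans all centers and skips the distant ones; a real
// implementation would look up only the neighbouring grid cells.
void assign_rows(const Image& im, const std::vector<Center>& centers,
                 float S, float m, int row_begin, int row_end,
                 std::vector<int>& labels)
{
    const float w = (m / S) * (m / S);
    for (int y = row_begin; y < row_end; ++y) {
        for (int x = 0; x < im.width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * im.width + x;
            const float* p = &im.lab[3 * i];
            float best = std::numeric_limits<float>::max();
            int best_k = -1;
            for (std::size_t k = 0; k < centers.size(); ++k) {
                const Center& c = centers[k];
                const float dx = c.x - x, dy = c.y - y;
                if (std::fabs(dx) > S || std::fabs(dy) > S) continue;
                const float dl = c.l - p[0], da = c.a - p[1], db = c.b - p[2];
                const float d = dl * dl + da * da + db * db + w * (dx * dx + dy * dy);
                if (d < best) { best = d; best_k = static_cast<int>(k); }
            }
            labels[i] = best_k;  // each pixel is written by exactly one thread
        }
    }
}

// Split the image into horizontal bands, one per thread.
void assign_parallel(const Image& im, const std::vector<Center>& centers,
                     float S, float m, int num_threads, std::vector<int>& labels)
{
    assert(num_threads > 0 && S > 0.0f);
    labels.assign(static_cast<std::size_t>(im.width) * im.height, -1);
    const int band = (im.height + num_threads - 1) / num_threads;
    std::vector<std::thread> workers;
    for (int begin = band; begin < im.height; begin += band) {
        const int end = std::min(im.height, begin + band);
        workers.emplace_back([&, begin, end] {
            assign_rows(im, centers, S, m, begin, end, labels);
        });
    }
    // The calling thread processes the first band itself.
    assign_rows(im, centers, S, m, 0, std::min(im.height, band), labels);
    for (std::thread& t : workers) t.join();
}
```

In OpenCV 2.4 the band function should map onto the operator() of a cv::ParallelLoopBody passed to cv::parallel_for_. The same one-pixel-per-worker layout is also the shape a hand-written CUDA kernel would likely take.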

  • Parallelizing SLIC & LSC for CUDA is very possible; in fact, all of the routines can be parallelized. As the author of this code in opencv_contrib, I plan to work on it and add a CUDA variant directly in opencv_contrib.

Having compared the performance of my application with the closed-source library opencv-2.4.12(.3) and the open-source opencv-3.1.0, I found that some parts run faster on opencv-3.1.0 (while for others, the Tegra-optimized closed-source build still performs better).

So trying opencv-3.1.0 could be an alternative, depending on your application.