Kernel launch overhead for GTX 680
Hi
Does anyone know what the kernel launch overhead is for the GTX 680?

I know the GTX 580 is quoted as having an approximately 5us launch overhead ...

#1
Posted 04/14/2012 06:37 PM   
Launch overhead depends on the hardware platform, the software stack (including the operating system), and kernel complexity. I recently measured 3us as the minimum launch overhead (i.e., for empty kernels) for a GTX 680 on an older workstation with PCIe 2 running 64-bit Linux.

Unless kernel runtime is very short, kernel launch overhead is typically not critical to application performance. If this is a concern for your application, I would suggest setting up a quick test. If you are using a Windows version newer than Windows XP, please note that the WDDM driver model incurs a lot of overhead, which the CUDA driver tries to partially alleviate by batching launches. Overhead is much lower with the TCC driver.
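
For anyone who wants to try such a quick test, here is a minimal sketch of one way to do it. It is an assumption about how a measurement like this could be set up, not the exact code behind the numbers in this thread. The names `empty_kernel` and `measure_launch_overhead_us` are made up for illustration. Note that it times a long run of back-to-back empty launches and divides by the count, so it reports an amortized per-launch time rather than the latency of one isolated launch.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Empty kernel (hypothetical name): it does no work, so the time
// measured is dominated by launch overhead.
__global__ void empty_kernel() {}

// Hypothetical helper: returns the average wall-clock time per launch
// in microseconds, or a negative value if a CUDA call fails.
double measure_launch_overhead_us(int launches) {
    // Warm-up launch so context creation and module loading are not timed.
    empty_kernel<<<1, 1>>>();
    if (cudaDeviceSynchronize() != cudaSuccess) return -1.0;

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < launches; ++i) {
        empty_kernel<<<1, 1>>>();
    }
    // Wait for every queued launch to finish before stopping the clock.
    if (cudaDeviceSynchronize() != cudaSuccess) return -1.0;
    auto stop = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / launches;
}

int main() {
    const int launches = 1000000;  // number of timed launches
    double us = measure_launch_overhead_us(launches);
    if (us < 0.0) {
        std::fprintf(stderr, "CUDA error: %s\n",
                     cudaGetErrorString(cudaGetLastError()));
        return 1;
    }
    std::printf("average time per empty launch: %.2f us over %d launches\n",
                us, launches);
    return 0;
}
```

Something like `nvcc -O2 launch_overhead.cu -o launch_overhead` should build it (the file name is arbitrary). Under WDDM the driver's launch batching will likely affect the result, so the number is best compared only against runs on the same platform and driver.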

#2
Posted 04/14/2012 08:23 PM   
Thanks

I also get an overhead of approximately 3.4us for an empty kernel launch over 1 million runs on a GTX 680.

#3
Posted 05/19/2012 10:35 AM   