How to quantify speed: FLOPs, integer and logic operations per second
I have seen articles where people posted the speed of their kernel in terms of FLOPs. How do people estimate the speed of their kernels?
I am dealing with kernels which have no floating point numbers but instead the computational part is mostly logical operations like AND and exclusive or (XOR).
Is there an equivalent of FLOPs?
Is there a convenient way to estimate non-FLOP performance?

Many thanks
Bill

#1
Posted 09/14/2011 09:52 AM   
Count the number of useful operations your kernel performs in each thread, multiply by the number of threads, and divide by the number of seconds the kernel took to complete. The theoretical "operation/second" rate is 1/2 of the FLOP rate because the FLOP rate counts FMA which performs two floating point operations in one clock.
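As a rough illustration of the arithmetic described above, here is a minimal Python sketch. The function names and every number in it are hypothetical, chosen only for illustration; the per-thread operation count is assumed to come from counting by hand, and the elapsed time is assumed to have been measured separately (for example with CUDA events around the kernel launch). The peak estimate simply applies the "half the FLOP rate" rule of thumb from this post, so treat it as an approximation rather than a hardware specification.

```python
def ops_per_second(ops_per_thread, num_threads, elapsed_seconds):
    """Achieved throughput: total useful operations divided by kernel run time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return ops_per_thread * num_threads / elapsed_seconds


def peak_ops_per_second(peak_flops):
    """Rule of thumb from the post: half the quoted FLOP rate,
    because that rate counts each FMA as two operations."""
    return peak_flops / 2.0


# Hypothetical numbers, for illustration only.
ops_per_thread = 64        # e.g. 32 ANDs + 32 XORs, counted by hand
num_threads = 1 << 20      # 1,048,576 threads launched
elapsed_seconds = 0.5e-3   # kernel time measured separately (0.5 ms)
peak_flops = 1.0e12        # assumed quoted peak FLOP rate of the GPU

achieved = ops_per_second(ops_per_thread, num_threads, elapsed_seconds)
fraction = achieved / peak_ops_per_second(peak_flops)
print(f"{achieved / 1e9:.1f} Gops/s, {fraction:.1%} of estimated peak")
```

Counting only the "useful" operations (here the ANDs and XORs) means address arithmetic and loop overhead are left out, so the figure describes the algorithm's throughput, not the total instruction rate.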

#2
Posted 09/14/2011 10:37 AM   
[quote name='DrAnderson42' date='14 September 2011 - 11:37 AM' timestamp='1315996628' post='1292706']
Count the number of useful operations your kernel performs in each thread, multiply by the number of threads, and divide by the number of seconds the kernel took to complete. The theoretical "operation/second" rate is 1/2 of the FLOP rate because the FLOP rate counts FMA which performs two floating point operations in one clock.
[/quote]
OK. I was hoping there might be a shortcut or tool to do this.
Thanks
Bill

#3
Posted 09/14/2011 12:50 PM   
I've become quite enamored with the Compute Visual Profiler. After it runs my test program 15 times to capture all the profiling data, I can look at some very interesting reports summarizing instruction throughput and memory throughput. It also seems to perform a simple "expert system" analysis of the results and makes some generic suggestions. I would highly recommend taking a look if you want more useful performance numbers.

#4
Posted 09/14/2011 01:40 PM   