Understanding L2 requests
Hi,

I used the Visual Profiler on a Mac to profile a sparse matrix-vector multiplication kernel. I found that the number of L1 misses * 15 (the number of SMs) is not equal to the number of L2 requests. Even (L1 misses + L1 hits) * 15 is less than the number of L2 requests. Can someone explain this?

L1 hits: 677342
L1 misses: 2.07111e+06
L2 requests: 1.23936e+08

#1
Posted 05/02/2012 03:23 PM   
[quote name='bowuwm' date='02 May 2012 - 08:23 AM' timestamp='1335972228' post='1403440']
Hi,

I used visual profiler on mac to profile a sparse matrix vector multiplication kernel. I found that the number of L1 misses * 15(num. of SMs) is not equal to the number of L2 requests. Even (num. of L1 misses + num. of L1 hits) * 15 < L2 requests. Can someone explain this?

L1 hits: 677342
L1 misses: 2.07111e+06
L2 requests: 1.23936e+08
[/quote]

You have about 60x more L2 requests than L1 misses (1.23936e+08 / 2.07111e+06 ≈ 59.8). 60 is 15 times 4. 15 is the number of SMs, as you noted, and I'd speculate that the 4 comes from the 4x difference between the L1 and L2 cache line sizes.
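
For anyone who wants to check the arithmetic, here is a minimal sketch using the counter values from the original post. The factor of 4 is only my guess (it assumes a 128-byte L1 line served as four 32-byte L2 transactions), and it also assumes the L1 counters are reported for a single SM. The helper name `l2_per_l1_miss` is made up for this example.

```python
# Counter values copied from the profiler output in the original post.
l1_hits = 677342
l1_misses = 2.07111e6
l2_requests = 1.23936e8

NUM_SMS = 15         # SM count stated in the original post
LINE_SIZE_RATIO = 4  # ASSUMPTION: 128-byte L1 line vs. 32-byte L2 transaction


def l2_per_l1_miss(l2_requests: float, l1_misses: float) -> float:
    """Return how many L2 requests were counted per reported L1 miss."""
    return l2_requests / l1_misses


# Ratio actually seen in the profiler counters.
observed = l2_per_l1_miss(l2_requests, l1_misses)

# Ratio predicted if L1 misses are per-SM and each miss costs 4 L2 requests.
expected = NUM_SMS * LINE_SIZE_RATIO

# How far the observed ratio is from the prediction, as a fraction.
relative_gap = abs(observed - expected) / expected

print(f"observed ratio: {observed:.2f}")
print(f"expected ratio: {expected}")
print(f"relative gap:   {relative_gap:.2%}")
```

The observed ratio comes out within 1% of 60, which is consistent with the guess but does not prove it. Other traffic that reaches L2 without going through the L1 miss counter could account for part of the number.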

#2
Posted 05/02/2012 09:01 PM   
[quote name='vvolkov' date='02 May 2012 - 10:01 PM' timestamp='1335992508' post='1403556']
You have 60x more L2 requests than L1 misses. 60 is 15 times 4. 15 is the number of SMs as you noted, and I'd speculate that 4 is due to the 4x difference between L2 and L1 cache line sizes.
[/quote]

Your speculation is quite reasonable. I think that is the reason. Thanks!

#3
Posted 05/03/2012 03:03 AM   