Reg: Options for changing the L1 cache size in OpenCL
Hi all,

I have read that the L1 cache size on the Fermi architecture (a GTX card in my case) is configurable: of the 64 KB available, it can be programmed as 48 KB of shared memory and 16 KB of L1 cache, or vice versa. I could not find a compiler flag or command option to achieve this in OpenCL. Is this option not available in OpenCL? Kindly clarify.
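For comparison, this is the split the CUDA runtime exposes through cudaDeviceSetCacheConfig (a device-wide preference). The following is a minimal CUDA-side sketch of the capability being described, not an OpenCL solution; I am not aware of a corresponding OpenCL flag or API.

```cuda
// CUDA-side sketch of the L1/shared-memory split described above (Fermi, CC 2.x).
// This uses the CUDA runtime API, not OpenCL.
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    // Ask for 48 KB shared memory / 16 KB L1 on the current device.
    // cudaFuncCachePreferL1 would request the opposite 16 KB / 48 KB split.
    cudaError_t err = cudaDeviceSetCacheConfig(cudaFuncCachePreferShared);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSetCacheConfig failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }
    printf("Requested 48 KB shared / 16 KB L1 as the device-wide preference.\n");
    return 0;
}
```

Note that this is only a preference hint: the driver may still pick a different configuration if a kernel needs more shared memory than the requested split allows.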

#1
Posted 11/29/2011 09:06 AM   
I was also wondering about this, especially since the OpenCL Programming Guide for the CUDA Architecture explicitly states this possibility in section C.4.1, sadly leaving out the detail of how to make use of it.

#2
Posted 03/14/2012 05:20 PM   
I am very much interested in this too. Should you come across the solution, please do post it. Thanks.

#3
Posted 03/15/2012 12:38 AM   
I'm interested too. I would like to test how my OpenCL kernels behave with a 48 KB L1 cache configuration, but I couldn't find any solution.

Thanks!
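While looking for a solution, one thing that can at least be done from OpenCL is to query what the runtime reports for local memory and global-memory cache sizes via clGetDeviceInfo. A small sketch follows; it only inspects the reported values and does not change the L1/shared split, and whether these values reflect the current split on Fermi is not guaranteed.

```cuda
// Sketch: query what the OpenCL runtime reports for local memory and
// global-memory cache sizes. This only reads values; it does not
// change the L1/shared-memory configuration.
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platform;
    cl_device_id   device;
    cl_ulong       local_mem = 0, cache_size = 0;

    if (clGetPlatformIDs(1, &platform, NULL) != CL_SUCCESS ||
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL) != CL_SUCCESS) {
        fprintf(stderr, "No OpenCL GPU device found.\n");
        return 1;
    }

    clGetDeviceInfo(device, CL_DEVICE_LOCAL_MEM_SIZE,
                    sizeof(local_mem), &local_mem, NULL);
    clGetDeviceInfo(device, CL_DEVICE_GLOBAL_MEM_CACHE_SIZE,
                    sizeof(cache_size), &cache_size, NULL);

    printf("CL_DEVICE_LOCAL_MEM_SIZE        : %llu bytes\n",
           (unsigned long long)local_mem);
    printf("CL_DEVICE_GLOBAL_MEM_CACHE_SIZE : %llu bytes\n",
           (unsigned long long)cache_size);
    return 0;
}
```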

#4
Posted 04/17/2012 11:02 AM   
Could you please give us the description you found in the OpenCL Programming Guide for the CUDA Architecture?

All I could find in section C.4.1 about the L1 cache is:

[indent]There is an L1 cache for each multiprocessor and an L2 cache shared by all multiprocessors, both of which are used to cache accesses to local or global memory, including temporary register spills.[/indent]

It does not point out any OpenCL APIs...

#5
Posted 07/02/2012 05:22 AM   
Sorry, I referenced the wrong section. In section 3.3 it states:
[quote]As mentioned in Section C.4.1, for devices of compute capability 2.0, the same on-chip memory is used for both L1 and shared memory, and how much of it is dedicated to L1 versus shared memory is configurable for each kernel call.[/quote]
It does not explicitly mention the OpenCL API for that. But as it is the OpenCL Programming Guide, why else would it mention that capability at all?
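For what it's worth, the per-kernel-call configurability that quote refers to is what the CUDA side exposes as cudaFuncSetCacheConfig. Below is a minimal CUDA sketch of that mechanism; again, no corresponding OpenCL entry point appears to be documented, and the dummyKernel here is purely illustrative.

```cuda
// CUDA-side sketch of the per-kernel cache preference the guide describes.
// dummyKernel exists only so the cache-config call has a function to refer to.
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void dummyKernel(float *out)
{
    out[threadIdx.x] = (float)threadIdx.x;
}

int main(void)
{
    // Request 48 KB L1 / 16 KB shared memory for this particular kernel;
    // cudaFuncCachePreferShared would request the opposite split.
    cudaError_t err = cudaFuncSetCacheConfig(dummyKernel, cudaFuncCachePreferL1);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFuncSetCacheConfig failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    float *d_out = NULL;
    cudaMalloc((void **)&d_out, 32 * sizeof(float));
    dummyKernel<<<1, 32>>>(d_out);   // launched with the L1-preferring configuration
    cudaDeviceSynchronize();
    cudaFree(d_out);
    return 0;
}
```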

#6
Posted 07/02/2012 07:24 AM   