Hi, everyone.
I wanted to test the texture cache line size on a Pascal GPU, so I wrote some simple code that reads a 1D array into registers and writes it to another array.
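__global__ void cacheLineTest(const float* src_, float* des_, unsigned int stride) {
    // Only threadIdx.x is scaled by stride; the block offset is not.
    int tid = blockIdx.x * blockDim.x + threadIdx.x * stride;
    // One 4-byte load and one 4-byte store per thread.
    des_[tid] = src_[tid];
}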
and I got the following results.
External Media
I have three questions:
- From these results, can I conclude that the 1D texture cache line size on Pascal is 32 bytes?
- What does gld_transactions really mean? Why does it differ from l2_tex_read_transactions when the stride is 1 and 3?
- I use neither __restrict__ nor __ldg(), so why do my load requests still go through the texture + L2 path?
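For reference, the numbers I would naively expect can be worked out on the host. The following is only a rough sketch in plain C++ (it also compiles under nvcc): segmentsTouchedByWarp is a hypothetical helper that is not part of my test program, and it assumes the array base is aligned to the segment size and that one transaction corresponds to one aligned segment, which may not match how the profiler actually counts.

```cpp
#include <cstddef>
#include <set>

// Hypothetical helper (not in the original test): counts how many distinct
// aligned segments of `segmentBytes` one 32-thread warp of block 0 touches
// when thread t reads the float at index t * stride, as cacheLineTest does.
// Assumption: the array base address is aligned to segmentBytes.
std::size_t segmentsTouchedByWarp(unsigned stride, std::size_t segmentBytes) {
    std::set<std::size_t> segments;
    for (unsigned t = 0; t < 32; ++t) {
        // Byte offset of the element read by thread t.
        std::size_t byteOffset =
            static_cast<std::size_t>(t) * stride * sizeof(float);
        // Index of the aligned segment that contains this offset.
        segments.insert(byteOffset / segmentBytes);
    }
    return segments.size();
}
```

With 32-byte segments this gives 4 segments per warp for stride 1 and 12 for stride 3; with 128-byte segments it gives 1 and 3. Comparing per-warp transaction counts against these two sets of numbers is one way to guess which granularity a counter is reporting.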
Thanks