Why is GPU memory latency so high?
I wonder why GPU memory latency is so high (around 400 cycles). Can anyone give me some ideas? Thanks very much.

#1
Posted 05/05/2012 06:11 AM   
It's not just memory latency; there's a chain of latencies on a modern GPU:

  • L1 cache latency
  • L2 cache latency
  • then the latency of communicating with the memory controller(s)
  • and the memory itself, which is designed for high throughput, not for low latency

At any given point you might have thousands of memory read and write operations in flight, waiting to be processed by the memory controllers, and the whole global memory subsystem is designed to give the best throughput. Considering how many operations may be waiting on memory at any point during execution, I actually find the latencies (usually 300-1000 cycles) to be low!
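
To put a rough number on "thousands of operations in flight", here is a back-of-the-envelope sketch using Little's law (concurrency = throughput × latency). The clock rate, bandwidth, and request size below are made-up round figures for illustration, not measurements of any particular GPU, and `requests_in_flight` is just a hypothetical helper name.

```python
def requests_in_flight(latency_cycles, clock_hz, bandwidth_bytes_per_s, request_bytes):
    """Little's law: requests that must be outstanding to sustain the bandwidth."""
    latency_s = latency_cycles / clock_hz                # latency in seconds
    bytes_in_flight = bandwidth_bytes_per_s * latency_s  # data "on the wire"
    return bytes_in_flight / request_bytes


# Assumed, illustrative figures only (not taken from a real datasheet).
LATENCY_CYCLES = 400   # the figure quoted in the question
CLOCK_HZ = 1.0e9       # assumed 1 GHz clock
BANDWIDTH = 100e9      # assumed 100 GB/s memory bandwidth

for size in (128, 4):  # a 128-byte transaction vs. a 4-byte per-thread read
    n = requests_in_flight(LATENCY_CYCLES, CLOCK_HZ, BANDWIDTH, size)
    print(f"{size:>3}-byte requests in flight: {n:.0f}")
```

With these assumed numbers, about 40 KB of data has to be in flight at once to keep the memory bus busy. That is a few hundred 128-byte transactions, or about ten thousand individual 4-byte reads.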

Parallelis.com, Parallel-computing technologies and benchmarks. Current Projects: OpenCL Chess & OpenCL Benchmark

#2
Posted 05/15/2012 08:33 PM   
It's not high for DDR-type memory. DDR memory latency is always high, as there is a lot of overhead in reading a memory line.

CPUs compensate with larger caches and lower parallelism. GPUs depend on latency hiding rather than large caches, so you need to write your code in a way that lets the latency hiding work.
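
As a rough illustration of what latency hiding needs in order to work, here is a toy model, not how any real scheduler is specified. It assumes each warp alternates between `compute_cycles` of independent arithmetic and one memory access that stalls it for `latency_cycles`, and that the scheduler can switch to another ready warp at no cost. The function names and numbers are hypothetical.

```python
import math


def warps_needed(latency_cycles, compute_cycles):
    """Warps required so that some warp always has arithmetic to issue."""
    return math.ceil((latency_cycles + compute_cycles) / compute_cycles)


def utilization(resident_warps, latency_cycles, compute_cycles):
    """Fraction of cycles the arithmetic units are busy in this toy model."""
    busy = resident_warps * compute_cycles
    return min(1.0, busy / (latency_cycles + compute_cycles))


# Assumed figures: 400-cycle memory latency, 20 cycles of math per access.
print(warps_needed(400, 20))
print(f"{utilization(8, 400, 20):.2f}")
```

In this model, a 400-cycle latency with 20 cycles of arithmetic per access needs 21 resident warps to keep the arithmetic units fully busy, and with only 8 warps they would sit idle more than half the time. More arithmetic per memory access, or more resident warps, both push utilization up.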

#3
Posted 05/22/2012 12:44 PM   