Why is GPU memory latency so high?

I wonder why GPU memory latency is so high (around 400 cycles). Can anyone give me some ideas? Thanks very much.

It’s not just memory latency; there is a chain of latencies on a modern GPU:

- L1 cache latency
- L2 cache latency
- latency of communicating with the memory controller(s)
- latency of the memory itself, which is designed for high throughput, not low latency

At any given moment, there may be thousands of memory reads and writes in flight, waiting to be processed by the memory controllers, and the whole global memory subsystem is designed for the best throughput. Considering how many operations may be waiting on memory at any point during execution, I actually find the latencies (usually 300-1000 cycles) to be low!
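To put a rough number on "thousands of operations in flight", here is a minimal sketch based on Little's law (concurrency = throughput × latency). The bandwidth, latency, and request size below are assumed, illustrative values, not measurements of any particular GPU, and the function names are hypothetical.

```python
def bytes_in_flight(bandwidth_gb_per_s: float, latency_ns: float) -> float:
    """Little's law: data that must be in flight to sustain the bandwidth.

    GB/s * ns = (1e9 bytes/s) * (1e-9 s) = bytes, so the units cancel.
    """
    return bandwidth_gb_per_s * latency_ns


def requests_in_flight(bandwidth_gb_per_s: float, latency_ns: float,
                       request_bytes: int) -> float:
    """Number of outstanding requests of a given size needed to keep the bus busy."""
    return bytes_in_flight(bandwidth_gb_per_s, latency_ns) / request_bytes


# Assumed, illustrative numbers (not taken from any datasheet).
BANDWIDTH_GB_S = 200.0   # sustained memory bandwidth
LATENCY_NS = 500.0       # end-to-end memory latency
REQUEST_BYTES = 128      # size of one memory transaction

if __name__ == "__main__":
    print(bytes_in_flight(BANDWIDTH_GB_S, LATENCY_NS))
    print(requests_in_flight(BANDWIDTH_GB_S, LATENCY_NS, REQUEST_BYTES))
```

Under these assumptions, about 100,000 bytes (roughly 780 transactions of 128 bytes) have to be outstanding at all times just to keep the memory system saturated; with smaller transactions the count climbs into the thousands.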

It’s not high for DDR memory. DDR memory latency is always high because there is a lot of overhead in reading a memory line.
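As a rough illustration of that overhead, here is a sketch of how the standard DRAM timing parameters (CL, tRCD, tRP) add up for a single read. The 11-11-11 timings at an 800 MHz clock are an assumed DDR3-1600-style example, not the GDDR timings of any specific GPU, and the function name is hypothetical. It covers only the DRAM device itself, not controller queueing or the cache levels in front of it.

```python
def dram_read_latency_ns(clock_mhz: float, cl: int, trcd: int, trp: int,
                         row_state: str) -> float:
    """Approximate time from command to first data for one DRAM read.

    row_state:
      "hit"      - the needed row is already open: CAS latency only
      "closed"   - no row is open: activate the row (tRCD), then read (CL)
      "conflict" - another row is open: precharge (tRP), activate, read
    """
    cycle_ns = 1000.0 / clock_mhz
    if row_state == "hit":
        cycles = cl
    elif row_state == "closed":
        cycles = trcd + cl
    elif row_state == "conflict":
        cycles = trp + trcd + cl
    else:
        raise ValueError(f"unknown row_state: {row_state!r}")
    return cycles * cycle_ns


if __name__ == "__main__":
    # Assumed example timings: 800 MHz clock, CL = tRCD = tRP = 11 cycles.
    for state in ("hit", "closed", "conflict"):
        print(state, dram_read_latency_ns(800.0, 11, 11, 11, state))
```

With these assumed timings, a read that lands on the wrong open row costs three times as much as a row hit, before any queueing in the memory controller is counted.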

CPUs compensate with larger caches and lower parallelism. A GPU depends on latency hiding rather than large caches, so you need to give it enough parallel work in flight for the latency hiding to work.
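A back-of-the-envelope sketch of what latency hiding demands, using a deliberately simplified model: each group of threads (a "warp" in NVIDIA terms) issues some cycles of independent work and then stalls on a memory access, and the scheduler switches to another group in the meantime. The function name and the busy-cycle figures are assumptions for illustration; real schedulers, caches, and instruction-level parallelism change the picture.

```python
import math


def warps_to_hide_latency(latency_cycles: int, busy_cycles_per_warp: int) -> int:
    """Warps needed so that one is always ready to issue (simplified model).

    While one warp waits latency_cycles for memory, the remaining N - 1
    warps must together supply at least that many cycles of work:
        (N - 1) * busy_cycles_per_warp >= latency_cycles
    """
    if busy_cycles_per_warp <= 0:
        raise ValueError("busy_cycles_per_warp must be positive")
    return 1 + math.ceil(latency_cycles / busy_cycles_per_warp)


if __name__ == "__main__":
    # 400 cycles is the latency quoted in the question; the busy-cycle
    # values are assumed for illustration.
    for busy in (8, 10, 20):
        print(busy, warps_to_hide_latency(400, busy))
```

In this model, the less arithmetic a kernel does per memory access, the more warps must be resident to cover the same 400 cycles, which is why a kernel launched with too few threads leaves the GPU waiting on memory.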