Electronic – How does the CPU read data from RAM

cache, cpu, ram

In a general-purpose computer (like a normal PC), how does the CPU read from RAM, assuming that it first checks the cache?

Assume the cache is an n-way set-associative cache, and that there is both an L1 cache and an L2 cache.

  1. Usually, the CPU reads a block of data from RAM, and a block can be several words long. Doesn't this cost a lot of time/clock cycles? How is this (or how can it be) made more efficient? (I am not asking about using a cache here, but about the data transfer itself.)

  2. RAM is much slower than the CPU. How does the CPU still manage to be efficient?

Best Answer

There are two things to distinguish here: throughput and latency.

On very simple, slow cores, the cache runs at the same speed as the CPU and can provide data in 1 cycle, so data is available immediately without stalling. On a cache miss, the data is fetched from main memory, and the initial latency can be over 10 cycles. The good news is that once the first data arrives, the data that follows can be obtained quickly, hence the idea of burst transfers and cache line fills. The CPU may only need a byte or a 32-bit word, but 32 or 64 bytes are transferred from memory to the cache in one go.
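To put rough numbers on why a burst is cheaper than it looks, here is a minimal sketch. The function names and the figures (10-cycle initial latency, 64-byte line, 8-byte bus, 1 cycle per extra beat) are assumptions chosen for illustration, not measurements of any real memory system.

```python
def line_fill_cycles(line_bytes, bus_bytes, first_access_latency, cycles_per_beat=1):
    """Cycles to fetch one cache line as a single burst (simplified model)."""
    beats = line_bytes // bus_bytes
    # Only the first beat pays the full access latency;
    # the remaining beats stream back to back.
    return first_access_latency + (beats - 1) * cycles_per_beat


def separate_reads_cycles(line_bytes, bus_bytes, first_access_latency):
    """Cycles if every bus-width word were fetched as its own access."""
    beats = line_bytes // bus_bytes
    # Every access pays the full latency again.
    return beats * first_access_latency


if __name__ == "__main__":
    # Assumed figures: 64-byte line, 8-byte bus, 10-cycle initial latency.
    print(line_fill_cycles(64, 8, 10))       # burst fill
    print(separate_reads_cycles(64, 8, 10))  # eight independent reads
```

With these assumed figures the burst costs 17 cycles against 80 for eight separate reads, so fetching the whole line adds only a little on top of the latency the CPU had to pay anyway for the one word it asked for.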

On more advanced CPUs, the ones with L1 and L2 caches, DRAM, and gigahertz clocks, even the contents of the L1 cache cannot be obtained immediately. For instructions, there are mechanisms that predict the instruction flow and fetch instructions in advance: the CPU keeps fetching consecutive addresses unless the instruction is a branch, a call, and so on. For data, it is more complex. Using pipelining, some CPUs can have several data transfers outstanding before they stall. The main solution used today to mitigate long latencies is out-of-order execution: the CPU does as much work as possible, even executing instructions out of program order, to hide the long latency of instructions such as data reads and writes.
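The benefit of keeping several data transfers outstanding can be sketched with a toy model. It assumes the misses are independent of each other (no load needs the result of an earlier one), that issuing a request costs nothing, and that misses complete in fully overlapped batches; the function name and the 100-cycle latency are made up for illustration.

```python
import math


def miss_wait_cycles(num_misses, miss_latency, max_outstanding):
    """Cycles spent waiting on independent cache misses (toy model)."""
    # Up to max_outstanding misses overlap completely, so each batch
    # costs one miss latency regardless of how many misses it holds.
    batches = math.ceil(num_misses / max_outstanding)
    return batches * miss_latency


if __name__ == "__main__":
    # Assumed figures: 8 misses, 100 cycles each.
    print(miss_wait_cycles(8, 100, 1))  # one miss at a time
    print(miss_wait_cycles(8, 100, 4))  # up to 4 misses in flight
```

Under these assumptions, a core that handles one miss at a time waits 800 cycles, while one that can keep 4 in flight waits 200. The latency of each access is unchanged; it is simply overlapped. If the loads depend on each other, as in a linked-list walk, this overlap is not available and the model does not apply.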