Server Motherboards Memory Per CPU

hardware, memory

I noticed that the new dual-socket 1366 server-type motherboards have two banks of RAM. Does this mean that if I have 72GB of RAM installed, Windows will only allow 36GB per processor, or will one processor have access to all 72GB?

Best Answer

A dual-socket board is laid out as two CPU "systems", each with its own memory slots attached to its socket. If there are two memory banks, each bank is wired to one CPU socket; a bank is not directly attached to the other socket, which can only reach it over the inter-socket link (QPI on these boards) at higher latency.

That implies a motherboard with 72GB of total capacity has 36GB of capacity per CPU socket. However, if your DIMMs are set up asymmetrically, as on this Intel board, I suspect you would end up with 24GB on one CPU and 48GB on the other... I'd need to confirm that.
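To see how that split looks from the operating system's side, here is a small sketch (my own illustration, not anything specific to this board) that asks Windows how much memory is currently free behind each NUMA node, using GetNumaHighestNodeNumber and GetNumaAvailableMemoryNode. On a symmetrically populated 72GB dual-socket machine you would expect roughly half of the free memory to be reported behind each of the two nodes:

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    ULONG highestNode = 0;
    ULONG node;

    /* Highest NUMA node number Windows knows about
       (0 means the machine looks like a single, uniform node). */
    if (!GetNumaHighestNodeNumber(&highestNode)) {
        fprintf(stderr, "GetNumaHighestNodeNumber failed: %lu\n", GetLastError());
        return 1;
    }

    for (node = 0; node <= highestNode; node++) {
        ULONGLONG availableBytes = 0;

        /* Free (not total) memory currently attached to this node. */
        if (GetNumaAvailableMemoryNode((UCHAR)node, &availableBytes)) {
            printf("Node %lu: %.1f GB available\n",
                   node, availableBytes / (1024.0 * 1024.0 * 1024.0));
        }
    }
    return 0;
}
```

Note that this reports free memory per node at the moment you run it, not the installed capacity per socket.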

If you are referring to a Nehalem-based 1366 board, you get three memory channels per CPU socket, so you populate DDR3 DIMMs in multiples of three to fill out your per-CPU memory.

The Nehalem architecture handles access to the other socket's memory bank better by using Non-Uniform Memory Access (NUMA).

NUMA attempts to address this problem by providing separate memory for each processor, avoiding the performance hit when several processors attempt to address the same memory. For problems involving spread data (common for servers and similar applications), NUMA can improve the performance over a single shared memory by a factor of roughly the number of processors (or separate memory banks).

Of course, not all data ends up confined to a single task, which means that more than one processor may require the same data. To handle these cases, NUMA systems include additional hardware or software to move data between banks. This operation has the effect of slowing down the processors attached to those banks, so the overall speed increase due to NUMA will depend heavily on the exact nature of the tasks run on the system at any given time.
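To illustrate that last point, the sketch below (my example; the node number 1 is an arbitrary assumption standing in for "the other socket") uses VirtualAllocExNuma to ask Windows to place a buffer's pages on a specific node and then touches it from whatever CPU the thread happens to be running on. The access works either way; it is simply slower when the pages live on the remote socket's bank:

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    const SIZE_T size = 64 * 1024 * 1024;   /* 64 MB test buffer */
    const DWORD preferredNode = 1;          /* assumption: the second socket's node */
    char *buf;
    SIZE_T i;

    /* Ask Windows to prefer physical pages on the given node
       (a preference, not a guarantee). */
    buf = (char *)VirtualAllocExNuma(GetCurrentProcess(), NULL, size,
                                     MEM_RESERVE | MEM_COMMIT,
                                     PAGE_READWRITE, preferredNode);
    if (buf == NULL) {
        fprintf(stderr, "VirtualAllocExNuma failed: %lu\n", GetLastError());
        return 1;
    }

    /* Touching each page faults it in; the current CPU can do this even if
       it sits on the other socket, just with higher latency. */
    for (i = 0; i < size; i += 4096)
        buf[i] = 1;

    printf("Committed and touched %llu bytes preferring node %lu\n",
           (unsigned long long)size, (unsigned long)preferredNode);

    VirtualFree(buf, 0, MEM_RELEASE);
    return 0;
}
```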


When you are not using Nehalem's NUMA, the older scheme (all sockets sharing a front-side bus to a single memory controller) works differently; a quick visual comparison is shown in this Ars Technica article. Basically, you get the worst-case access time for everything, since every memory access pays the full cost of the shared multi-way path.

The NUMA approach gives better access times overall: local accesses are fast, and accesses across banks are still serviced over the interconnect. The net result is better memory throughput, particularly when each processor socket keeps its working data localized in its own bank.
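As a hedged sketch of that "keep the data local" idea (again my own illustration, with node 0 chosen arbitrarily), you can pin a thread to one node's processors with GetNumaNodeProcessorMask and SetThreadAffinityMask, and then allocate its working buffer on that same node with VirtualAllocExNuma:

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    const UCHAR node = 0;          /* arbitrary: use the first socket's node */
    ULONGLONG processorMask = 0;
    void *local;

    /* Which logical processors belong to this node? */
    if (!GetNumaNodeProcessorMask(node, &processorMask) || processorMask == 0) {
        fprintf(stderr, "GetNumaNodeProcessorMask failed: %lu\n", GetLastError());
        return 1;
    }

    /* Restrict this thread to the node's own cores... */
    if (SetThreadAffinityMask(GetCurrentThread(), (DWORD_PTR)processorMask) == 0) {
        fprintf(stderr, "SetThreadAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }

    /* ...and place its working buffer on the same node's memory bank. */
    local = VirtualAllocExNuma(GetCurrentProcess(), NULL, 16 * 1024 * 1024,
                               MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE, node);
    if (local == NULL) {
        fprintf(stderr, "VirtualAllocExNuma failed: %lu\n", GetLastError());
        return 1;
    }

    printf("Thread pinned to node %u; 16 MB allocated on the same node\n", node);
    VirtualFree(local, 0, MEM_RELEASE);
    return 0;
}
```

Windows itself already tries to schedule threads near the memory they use; the sketch just makes the per-socket bank visible at the API level.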


I am not yet confident about all points of this answer and invite other opinions.
