HP ProLiant DL360 Gen9 vs IBM System x3850 X5: NUMA processor group usage

central-processing-unit drivers numa windows-server-2012-r2

The same C# executable, written to run on every NUMA node, behaves differently on the two servers:

  • HP: runs on one node only (one processor group, either one of the 2). Problem: it is supposed to run on every node.
  • IBM: runs on all nodes (every processor group).

Both machines run Windows Server 2012 R2 and have at least 2 CPUs.

HP: 2x Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz, 2597 Mhz, 14 Core(s), 28 Logical Processor(s)
IBM: 4x Intel(R) Xeon(R) CPU E7- 4870 @ 2.40GHz, 2395 Mhz, 10 Core(s), 20 Logical Processor(s)

I tried to answer my own question on Stack Overflow. All the details are available at this link.

In my opinion, this points to a faulty driver on the HP server, or to a configuration setting in the BIOS or in Windows.

Any idea what exactly could cause this?

HP MsInfo32 dump:

OS Name            Microsoft Windows Server 2012 R2 Standard
Version               6.3.9600 Build 9600
Other OS Description    Not Available
OS Manufacturer            Microsoft Corporation
System Name   EMTP6
System Manufacturer   HP
System Model  ProLiant DL360 Gen9
System Type     x64-based PC
System SKU       755258-B21
Processor           Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz, 2597 Mhz, 14 Core(s), 28 Logical Processor(s)
Processor           Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz, 2597 Mhz, 14 Core(s), 28 Logical Processor(s)
BIOS Version/Date         HP P89, 7/11/2014
SMBIOS Version              2.8
Embedded Controller Version 2.02
BIOS Mode         UEFI
Platform Role   Enterprise Server
Secure Boot State           Off
PCR7 Configuration       Not Available
Windows Directory        ---removed
System Directory            ---removed
Boot Device       \Device\HarddiskVolume2
Locale   United States
Hardware Abstraction Layer      Version = "6.3.9600.17196"
User Name         Not Available
Time Zone          Eastern Standard Time
Installed Physical Memory (RAM)          256 GB
Total Physical Memory 256 GB
Available Physical Memory       246 GB
Total Virtual Memory   294 GB
Available Virtual Memory          283 GB
Page File Space               38.0 GB
Page File             ---removed
Hyper-V - VM Monitor Mode Extensions            Yes
Hyper-V - Second Level Address Translation Extensions             Yes
Hyper-V - Virtualization Enabled in Firmware  Yes
Hyper-V - Data Execution Protection    Yes

IBM MsInfo32 dump:

OS Name Microsoft Windows Server 2012 R2 Standard
Version 6.3.9600 Build 9600
Other OS Description Not Available
OS Manufacturer Microsoft Corporation
System Manufacturer IBM
System Model System x3850 X5
System Type x64-based PC
System SKU
Processor Intel(R) Xeon(R) CPU E7- 4870 @ 2.40GHz, 2395 Mhz, 10 Core(s), 20 Logical Processor(s)
Processor Intel(R) Xeon(R) CPU E7- 4870 @ 2.40GHz, 2395 Mhz, 10 Core(s), 20 Logical Processor(s)
Processor Intel(R) Xeon(R) CPU E7- 4870 @ 2.40GHz, 2395 Mhz, 10 Core(s), 20 Logical Processor(s)
Processor Intel(R) Xeon(R) CPU E7- 4870 @ 2.40GHz, 2395 Mhz, 10 Core(s), 20 Logical Processor(s)
BIOS Version/Date IBM Corp. -[G0E179BUS-1.79]-, 28-07-2013
SMBIOS Version 2.7
Embedded Controller Version 255.255
BIOS Mode UEFI
BaseBoard Manufacturer IBM
BaseBoard Model Not Available
BaseBoard Name Base Board
Platform Role Enterprise Server
Secure Boot State Unsupported
PCR7 Configuration Not Available
Hardware Abstraction Layer Version = "6.3.9600.17031"
User Name Not Available
Time Zone Romance Standard Time
Installed Physical Memory (RAM) 128 GB
Total Physical Memory 128 GB
Available Physical Memory 53,0 GB
Total Virtual Memory 147 GB
Available Virtual Memory 67,7 GB
Hyper-V - VM Monitor Mode Extensions Yes
Hyper-V - Second Level Address Translation Extensions Yes
Hyper-V - Virtualization Enabled in Firmware Yes
Hyper-V - Data Execution Protection Yes

Best Answer

The bug has been (partly) fixed by a new, as yet unpublished HP BIOS (at the time of writing).

The new BIOS (targeting HP ProLiant DL360 and DL380 Gen9) introduces a new setting, "NUMA Group Size Optimizations", with a choice of [Clustered - default] or [Flat]. HP says to set it to Flat.

As far as I know, the OS queries the BIOS to learn the CPU configuration. The BIOS therefore plays an important role in how the OS presents the available logical processors to applications (processor groups, affinity, etc.).

I think the new BIOS only partly fixes the problem. Here is why:

  • There is only one processor group, where I thought it would have been better to have one per NUMA node.
  • Also, running one busy thread (working 100% of the time) per logical processor keeps all logical processors (across all nodes) only ~40% busy. I expected 100% usage.
  • I strongly suspect that HP will release another BIOS that corrects this situation (the single processor group and the ~40% usage).
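
As a side note on the group counts: Windows caps a single processor group at 64 logical processors. The rough Python sketch below is not part of the original C# program. It only does the arithmetic on the CPU counts from the MsInfo32 dumps above, and the names `SERVERS`, `total_logical` and `min_groups` are made up for illustration. It computes the minimum number of groups each machine needs. How Windows actually splits processors into groups also depends on what the firmware reports, so treat this as a lower bound only.

```python
import math

# Windows caps one processor group at 64 logical processors.
MAX_LOGICAL_PER_GROUP = 64

# (sockets, logical processors per socket), taken from the MsInfo32 dumps.
SERVERS = {
    "HP ProLiant DL360 Gen9": (2, 28),
    "IBM System x3850 X5": (4, 20),
}


def total_logical(sockets, logical_per_socket):
    """Total logical processors across all sockets."""
    return sockets * logical_per_socket


def min_groups(total):
    """Smallest number of processor groups that can hold `total` logical processors."""
    return math.ceil(total / MAX_LOGICAL_PER_GROUP)


for name, (sockets, per_socket) in SERVERS.items():
    total = total_logical(sockets, per_socket)
    print(f"{name}: {total} logical processors, at least {min_groups(total)} group(s)")
```

The HP box has 56 logical processors, which fit in a single group, whereas the IBM box has 80 and needs at least two. That would be consistent with the "Flat" setting producing one group on the HP, although it does not explain the ~40% usage.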
