Linux – “power limit notification” clobbering on 12G Dell servers with RHEL6

Server: Poweredge r620
OS: RHEL 6.4
Kernel: 2.6.32-358.18.1.el6.x86_64

I'm experiencing application alarms in my production environment. Critical CPU hungry processes are being starved of resources and causing a processing backlog. The problem is happening on all the 12th Generation Dell servers (r620s) in a recently deployed cluster. As near as I can tell, instances of this happening are matching up to peak CPU utilization, accompanied by massive amounts of "power limit notification" spam in dmesg. An excerpt of one of these events:

Nov  7 10:15:15 someserver [.crit] CPU12: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU0: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU6: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU14: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU18: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU2: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU4: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU16: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU0: Package power limit notification (total events = 11)
Nov  7 10:15:15 someserver [.crit] CPU6: Package power limit notification (total events = 13)
Nov  7 10:15:15 someserver [.crit] CPU14: Package power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU18: Package power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU20: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU8: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU2: Package power limit notification (total events = 12)
Nov  7 10:15:15 someserver [.crit] CPU10: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU22: Core power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU4: Package power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU16: Package power limit notification (total events = 13)
Nov  7 10:15:15 someserver [.crit] CPU20: Package power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU8: Package power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU10: Package power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU22: Package power limit notification (total events = 14)
Nov  7 10:15:15 someserver [.crit] CPU15: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU3: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU1: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU5: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU17: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU13: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU15: Package power limit notification (total events = 375)
Nov  7 10:15:15 someserver [.crit] CPU3: Package power limit notification (total events = 374)
Nov  7 10:15:15 someserver [.crit] CPU1: Package power limit notification (total events = 376)
Nov  7 10:15:15 someserver [.crit] CPU5: Package power limit notification (total events = 376)
Nov  7 10:15:15 someserver [.crit] CPU7: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU19: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU17: Package power limit notification (total events = 377)
Nov  7 10:15:15 someserver [.crit] CPU9: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU21: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU23: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU11: Core power limit notification (total events = 369)
Nov  7 10:15:15 someserver [.crit] CPU13: Package power limit notification (total events = 376)
Nov  7 10:15:15 someserver [.crit] CPU7: Package power limit notification (total events = 375)
Nov  7 10:15:15 someserver [.crit] CPU19: Package power limit notification (total events = 375)
Nov  7 10:15:15 someserver [.crit] CPU9: Package power limit notification (total events = 374)
Nov  7 10:15:15 someserver [.crit] CPU21: Package power limit notification (total events = 375)
Nov  7 10:15:15 someserver [.crit] CPU23: Package power limit notification (total events = 374)

A little Google Fu reveals that this is typically associated with the CPU running hot, or voltage regulation kicking in. I don't think that's what is happening though. Temperature sensors for all servers in the cluster are running fine, Power Cap Policy is disabled in the iDRAC, and my System Profile is set to "Performance" on all of these servers:

# omreport chassis biossetup | grep -A10 'System Profile'
System Profile Settings
------------------------------------------
System Profile                                    : Performance
CPU Power Management                              : Maximum Performance
Memory Frequency                                  : Maximum Performance
Turbo Boost                                       : Enabled
C1E                                               : Disabled
C States                                          : Disabled
Monitor/Mwait                                     : Enabled
Memory Patrol Scrub                               : Standard
Memory Refresh Rate                               : 1x
Memory Operating Voltage                          : Auto
Collaborative CPU Performance Control             : Disabled

A Dell mailing list post describes the symptoms almost perfectly. Dell suggested that the author try using the Performance profile, but that didn't help. He ended up applying some settings in Dell's guide for configuring a server for low latency environments and one of those settings (or a combination thereof) seems to have fixed the problem.
Kernel.org bug #36182 notes that power-limit interrupt debugging was enabled by default, which is causing performance degradation in scenarios where CPU voltage regulation is kicking in.
A RHN KB article (RHN login required) mentions a problem impacting PE r620 and r720 servers not running the Performance profile, and recommends an update to a kernel released two weeks ago. …Except we are running the Performance profile…

Everything I can find online is running me in circles here. What's the heck is going on?

Best Answer

It's not the voltage regulation that causes the performance problem, but the debugging kernel interrupts that are being triggered by it.

Despite some misinformation on Redhat's part, all of the linked pages are referring to the same phenomenon. The voltage regulation happens with or without the Performance profile, likely due to the Turbo Boost feature being enabled. Regardless of reason, these voltage fluctuations are interacting poorly with the power-limit kernel interrupts that are enabled by default in kernel 2.6.32-358.18.1.el6.x86_64.

Confirmed Workarounds:

Upgrading to the most recently released Redhat kernel (2.6.32-358.23.2.el6) disables this debugging and eliminates the performance problem.
Adding the following kernel parameters to grub.conf will disable PLNs: clearcpuid=229

Flaky Workarounds:

Setting a System Profile of "Performance". This by itself was not enough to disable PLNs on our servers. Your mileage may vary.

Bad Workarounds:

Blacklisting ACPI related modules. I've seen this in a few forum threads. Ill-advised, so don't.

Best Answer

Related Solutions

Linux – Improving SAS multipath to JBOD performance on Linux

Dell Poweredge power management

Related Topic