Ubuntu 12.04 – Fix HP ProLiant DL380 G4 Load Maxing Out

Tags: hp, hp-proliant, hp-smart-array, ubuntu-12.04

Trying to post this question on here. I've posted it on the Ubuntu forums as well with no replies.

Recently I upgraded an HP ProLiant DL380 G4 server from Ubuntu 10.04 server to Ubuntu 12.04 server.

Since the upgrade, the server will now, at random times, climb to a load of 400+ and then become completely non-responsive. I use an SNMP graphing program (Cacti), and the load steadily increases by about 10 every five minutes until it passes 400 and the graphing stops.

The graphs may not be accurate, but CPU usage averages about 3% before this happens. Right when the load starts increasing, it jumps to about 25% for 15 minutes and then dips dramatically to less than 1% (about 0.3%) until the graphing stops.

I'm not able to open an SSH connection to the server to do anything. I've checked /var/log/syslog, and all logging stops at that time as well, with nothing else in there.

The odd thing is that the server still responds to DNS queries for the zones it is authoritative for during this time, and at normal speed.

I'm just not sure what the next step would be to find out what is going on and how this issue can be corrected. The server cannot go back to Ubuntu 10.04 Server; it needs to stay upgraded.

Best Answer

This would be an I/O-related issue, as disk access and all write activity (logging included) stop. The kernel and networking stack are running in RAM, which is why the server is still pingable.

The main things I would check are the system's BIOS/firmware and the firmware revision of the Smart Array controller in the system. This is an old ProLiant DL380 G4 (circa 2005), so you have either the onboard Smart Array 6i controller, a Smart Array 641 controller or a Smart Array 6400-series controller.

Can you tell us more?
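A rough sketch of how those details could be gathered from a shell. It assumes the controller is driven by the cciss module and shows up as the first controller under /proc/driver/cciss/cciss0, and that this file contains a "Firmware Version:" line; both are assumptions about your system, so adjust as needed. The helper name cciss_firmware is made up for illustration.

```shell
# Hypothetical helper: pull the firmware revision out of the text that the
# cciss driver exposes under /proc (assumed to contain a line such as
# "Firmware Version: <rev>"). Reads that text on stdin.
cciss_firmware() {
    awk -F': *' '/^Firmware Version/ { print $2 }'
}

# Possible usage, as root (the path assumes the cciss driver and controller 0):
#   cciss_firmware < /proc/driver/cciss/cciss0
#   dmidecode -s bios-version          # system BIOS revision
#   dmidecode -s bios-release-date     # system BIOS date
#   lspci | grep -i 'smart array'      # which Smart Array model is present
```

Posting the controller model together with the BIOS and firmware revisions would make it much easier to tell whether outdated firmware is a plausible cause.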

The rapid load rise is due to processes being blocked waiting for I/O. You don't say what type of application is running on the system, but it seems like you probably have, say, 380+ processes waiting for disk :)
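On Linux the load average counts tasks in uninterruptible sleep (state D, usually waiting on disk) as well as runnable ones, which is how load can reach 400 while the CPU sits nearly idle. If you can keep a console or SSH session open before the next hang, something like the following sketch may help confirm it. The function names are made up for illustration.

```shell
# Count processes in uninterruptible sleep (state D) from
# "ps -eo state,pid,comm" style input read on stdin.
count_blocked() {
    awk '$1 ~ /^D/ { n++ } END { print n + 0 }'
}

# List the blocked processes together with the kernel function they are
# sleeping in (wchan), keeping the ps header line for readability.
list_blocked() {
    ps -eo state,pid,wchan:32,comm | awk 'NR == 1 || $1 ~ /^D/'
}

# Possible usage while the load is climbing:
#   cat /proc/loadavg
#   ps -eo state,pid,comm | count_blocked
#   list_blocked
```

If the D-state count tracks the load figure and the wchan column points at block-layer or filesystem functions, that supports the theory that the processes are stuck waiting on the array.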

-- edit --

So, I deployed lots of those servers over the years. Do you have access to the firmware? Are you running the HP Management Agents? This will give you more insight into what you need here and get proper drivers in place.

And finally... this is reallllly old gear... Would you consider an upgrade?

See: HP Proliant DL380 G4 - Can this server still perform in 2011?

-- edit --

Try # modinfo cciss and post the result.

[root@MDMarra ~]# modinfo cciss
filename:       /lib/modules/2.6.32-279.14.1.el6.x86_64/kernel/drivers/block/cciss.ko
license:        GPL
version:        3.6.28
description:    Driver for HP Smart Array Controllers
author:         Hewlett-Packard Company
srcversion:     712C176F5D360D8C1166F22
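To compare your driver against the output above, a small sketch like this pulls a single field out of modinfo output. The helper name modinfo_field is hypothetical; modinfo's own -F option does the same job directly.

```shell
# Hypothetical helper: print the value of one field (for example "version")
# from "modinfo" output read on stdin. Matches the field name exactly, so
# "version" does not also match "srcversion".
modinfo_field() {
    awk -v key="$1:" '$1 == key { $1 = ""; sub(/^ +/, ""); print }'
}

# Possible usage:
#   modinfo cciss | modinfo_field version
#   modinfo -F version cciss           # same result using modinfo itself
#   lsmod | grep cciss                 # confirm the module is actually loaded
```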