With the warmer summer days arriving, my server is increasingly going into thermal protection shutdown because temperature sensor #21 reaches its critical level (58°C), as reported in iLO.
The fans are all running fine, but unfortunately the server is not in a climate-controlled room and there isn't much I can do about the ambient temperature. However, if I knew where sensor #21 is, I could try to improve the airflow around that area. And if it's on an IC, I could add or beef up the heatsink to improve cooling.
Does anybody know where that sensor is?
Edit:
Levels at night time:
Sensor            Location        Status  Reading  Thresholds
Temp 1:           System Zone     Ok      15C      Caution: 42C; Critical:47C
Temp 2 (CPU 1):   System 1        Ok      40C      Caution: 82C; Critical:83C
Temp 3 (CPU 2):   System 2        Ok      40C      Caution: 82C; Critical:83C
Temp 4:           Memory Zone     Ok      37C      Caution: 87C; Critical:92C
Temp 5:           Memory Zone     Ok      41C      Caution: 87C; Critical:92C
Temp 6:           System Zone     n/a     n/a      Caution: 99C; Critical:99C
Temp 7:           System Zone     n/a     n/a      Caution: 99C; Critical:99C
Temp 8 (MemB0):   Memory Zone     Ok      34C      Caution: 62C; Critical:67C
Temp 9 (MemB0):   Memory Zone     Ok      34C      Caution: 61C; Critical:66C
Temp 10 (MemB0):  Memory Zone     Ok      33C      Caution: 61C; Critical:66C
Temp 12 (MemB1):  Memory Zone     Ok      38C      Caution: 66C; Critical:71C
Temp 13 (MemB1):  Memory Zone     Ok      39C      Caution: 65C; Critical:70C
Temp 14 (MemB1):  Memory Zone     Ok      38C      Caution: 70C; Critical:75C
Temp 15:          System Zone     Ok      40C      Caution: 57C; Critical:62C
Temp 16:          System Zone     Ok      33C      Caution: 50C; Critical:55C
Temp 17:          System Zone     Ok      35C      Caution: 58C; Critical:63C
Temp 18:          System Zone     Ok      45C      Caution: 110C; Critical:115C
Temp 19:          System Zone     Ok      41C      Caution: 57C; Critical:62C
Temp 20:          System Zone     Ok      42C      Caution: 53C; Critical:58C
Temp 21:          System Zone     Ok      49C      Caution: 53C; Critical:58C
Temp 22 (PCIR):   I/O Board Zone  n/a     n/a      Caution: 99C; Critical:99C
Temp 23 (PCIR):   I/O Board Zone  n/a     n/a      Caution: 99C; Critical:99C
Temp 24 (PCIR):   I/O Board Zone  n/a     n/a      Caution: 99C; Critical:99C
Temp 25 (PCIR):   I/O Board Zone  n/a     n/a      Caution: 99C; Critical:99C
Temp 26:          Storage Zone    Ok      0C       Caution: 99C; Critical:99C
Temp 27:          Storage Zone    Ok      0C       Caution: 99C; Critical:99C
Temp 28:          Storage Zone    Ok      0C       Caution: 99C; Critical:99C
Temp 29:          Storage Zone    Ok      0C       Caution: 99C; Critical:99C
Temp 30:          Storage Zone    Ok      0C       Caution: 99C; Critical:99C
Temp 31:          Storage Zone    Ok      0C       Caution: 99C; Critical:99C
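To see which sensors are closest to tripping, the readings can be ranked by their margin to the caution threshold. Below is a minimal Python sketch, assuming the iLO text layout shown above. The regex, the `ILO_ROWS` sample (a handful of rows from the listing, not the full set) and the `headroom` helper are my own illustration and not part of any HP tool.

```python
import re

# A few rows taken from the iLO readings above (not the full list).
ILO_ROWS = """\
Temp 1: System Zone Ok 15C Caution: 42C; Critical:47C
Temp 15: System Zone Ok 40C Caution: 57C; Critical:62C
Temp 16: System Zone Ok 33C Caution: 50C; Critical:55C
Temp 19: System Zone Ok 41C Caution: 57C; Critical:62C
Temp 20: System Zone Ok 42C Caution: 53C; Critical:58C
Temp 21: System Zone Ok 49C Caution: 53C; Critical:58C
Temp 22 (PCIR): I/O Board Zone n/a n/a Caution: 99C; Critical:99C
"""

# Hypothetical pattern for one row; rows with an "n/a" reading do not match.
ROW = re.compile(
    r"Temp (?P<num>\d+)[^:]*:\s+.*?\s(?P<reading>\d+)C\s+"
    r"Caution:\s*(?P<caution>\d+)C;\s*Critical:\s*(?P<critical>\d+)C"
)


def headroom(rows):
    """Return (sensor, reading, caution, margin) tuples, tightest margin first."""
    result = []
    for line in rows.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue  # header line, blank line, or sensor with no reading
        reading = int(m["reading"])
        caution = int(m["caution"])
        result.append((int(m["num"]), reading, caution, caution - reading))
    return sorted(result, key=lambda r: r[3])


if __name__ == "__main__":
    for num, reading, caution, margin in headroom(ILO_ROWS):
        print(f"Temp {num}: {reading}C, {margin}C below caution ({caution}C)")
```

On these night-time figures sensor 21 comes out first, only 4C below its caution threshold, with sensor 20 next at 11C.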
Many thanks in advance.
Best Answer
Does it matter which sensor #21 is? What would you actually do about it?
Can you check your ambient temperature? What can you control about your environment to keep that within a reasonable range? Are you absolutely sure you don't have a failed fan?
--edit--
It makes sense to ensure the firmware of ALL of your components is up to date. For you, that means your system BIOS, iLO, RAID controller, NIC, backplane and disks. These can all be covered by the HP Service Pack for ProLiant bootable DVD. Please download and run it.
Check the internal health LED on the server. If you're running a supported version of Linux or Windows, install the HP management agents and check the output of
hplog -t
to get temperature sensor information. The output of a standard DL180 G6 config looks like the following. Correlate your results with mine.
Note: there are four fans in this system.