Get power status over IPMI from Zabbix

ipmi, supermicro, zabbix

I am looking for a way to monitor server hardware such as fans, power supplies, etc. The problem is that we have a very dynamic environment: servers are automatically powered on and off, even several times a day, depending on load.

I created templates for our Supermicro servers (we have just 3-4 types of them, so the templates are very specific) that contain a fan speed check (0 means the fan is dead). However, every time I turn off a server, its fan speed is also 0.

So I am now looking for a way to get the power status (or any other indicator that the server is running) over IPMI, so that Zabbix sends an alert only if the server is running.

Unfortunately, using IPMI is a requirement, because we monitor some servers this way that we don't have access to.

I'd like to avoid writing a script that runs something like: ipmitool power status. Zabbix has an amazing IPMI integration, so I'd like to use it as much as possible.

ipmitool sensor returns:

root@virt1:~# ipmitool sensor
System Temp      | 28.000     | degrees C  | ok    | -9.000    | -7.000    | -5.000    | 75.000    | 77.000    | 79.000
CPU Temp         | 0x0        | discrete   | 0x0000| na        | na        | na        | na        | na        | na
FAN 1            | 8355.000   | RPM        | ok    | 400.000   | 585.000   | 770.000   | 29260.000 | 29815.000 | 30370.000
FAN 2            | 8355.000   | RPM        | ok    | 400.000   | 585.000   | 770.000   | 29260.000 | 29815.000 | 30370.000
FAN 3            | 8725.000   | RPM        | ok    | 400.000   | 585.000   | 770.000   | 29260.000 | 29815.000 | 30370.000
FAN 4            | na         | RPM        | na    | na        | na        | na        | na        | na        | na
CPU Vcore        | 1.144      | Volts      | ok    | 0.640     | 0.664     | 0.688     | 1.344     | 1.408     | 1.472
+3.3VCC          | 3.280      | Volts      | ok    | 2.816     | 2.880     | 2.944     | 3.584     | 3.648     | 3.712
+12 V            | 12.031     | Volts      | ok    | 10.494    | 10.600    | 10.706    | 13.091    | 13.197    | 13.303
DIMM             | 1.544      | Volts      | ok    | 1.152     | 1.216     | 1.280     | 1.760     | 1.776     | 1.792
+5 V             | 5.216      | Volts      | ok    | 4.096     | 4.320     | 4.576     | 5.344     | 5.600     | 5.632
+5VSB            | 5.056      | Volts      | ok    | 4.096     | 4.320     | 4.576     | 5.344     | 5.600     | 5.632
VBAT             | 3.232      | Volts      | ok    | 2.816     | 2.880     | 2.944     | 3.584     | 3.648     | 3.712
+3.3VSB          | 3.280      | Volts      | ok    | 2.816     | 2.880     | 2.944     | 3.584     | 3.648     | 3.712
AVCC             | 3.280      | Volts      | ok    | 2.816     | 2.880     | 2.944     | 3.584     | 3.648     | 3.712
Chassis Intru    | 0x0        | discrete   | 0x0000| na        | na        | na        | na        | na        | na
PS Status        | 0x1        | discrete   | 0x01ff| na        | na        | na        | na        | na        | na
root@virt1:~#

Best Answer

One idea is to query a power on/off sensor. It is a discrete sensor; see https://www.zabbix.com/documentation/2.2/manual/config/items/itemtypes/ipmi for an example of how to evaluate the state of a discrete sensor.
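Evaluating a discrete sensor comes down to testing individual bits of its reading. The following is only a minimal sketch of that bit test in Python, not Zabbix configuration. The function name and the mask are hypothetical; bit 0 is "presence detected" for power supply sensors in the IPMI specification, but whether the "PS Status" sensor on a given Supermicro board follows that layout, or changes at all when the server is off, is an assumption to verify on real hardware.

```python
def discrete_bit_set(reading: str, mask: int) -> bool:
    """Return True if every bit in mask is set in a discrete IPMI reading.

    reading is a hex string as printed by ipmitool, e.g. "0x1".
    """
    value = int(reading, 16)
    return (value & mask) == mask


# Hypothetical mask: bit 0 of a power supply sensor ("presence detected"
# in the IPMI specification; the board's actual bit meanings may differ).
PS_PRESENT = 0x01

print(discrete_bit_set("0x1", PS_PRESENT))  # reading taken from "PS Status" above
print(discrete_bit_set("0x0", PS_PRESENT))
```

Comparing the reading while the server is on and while it is off shows which bit, if any, is usable as a power indicator.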

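If a power on/off sensor is not available, you can read analog voltage sensors instead, for example "+5 V" (or a few more voltage sensors). If the voltage is near zero, the server is probably switched off (or the power supply has failed). The standby rails (+5VSB, +3.3VSB) are not suitable for this, since they normally stay powered while the server is off.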
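The voltage-based gating can be sketched as follows. This is an illustration of the logic only, written in Python against text in the format of the ipmitool sensor listing from the question. The function names, the sample (FAN 1 set to 0 for demonstration) and the 4.0 V threshold are assumptions, not values from Zabbix or Supermicro.

```python
def parse_sensors(text: str) -> dict:
    """Map sensor name -> numeric reading, skipping 'na' and discrete rows."""
    readings = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 3:
            continue
        name, value = fields[0], fields[1]
        try:
            readings[name] = float(value)
        except ValueError:
            continue  # "na" or hex values such as "0x0"
    return readings


def dead_fans(readings: dict, rail: str = "+5 V", min_volts: float = 4.0) -> list:
    """Return fans reading 0 RPM, but only when the rail shows the server is on."""
    # A missing or near-zero rail is treated as "server powered off".
    if readings.get(rail, 0.0) < min_volts:
        return []  # suppress fan alerts for a server that is off
    return [n for n, v in readings.items() if n.startswith("FAN") and v == 0.0]


# Sample in the format of the listing above; FAN 1 altered to 0 for the demo.
SAMPLE = """\
FAN 1            | 0.000      | RPM        | ok
FAN 2            | 8355.000   | RPM        | ok
FAN 4            | na         | RPM        | na
+5 V             | 5.216      | Volts      | ok
+5VSB            | 5.056      | Volts      | ok
"""

print(dead_fans(parse_sensors(SAMPLE)))  # ['FAN 1']
```

In Zabbix the same condition would be a trigger that combines the fan item with the voltage item, so that a fan speed of 0 raises an alert only while the main rail is up.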
