Nagios – measuring Average CPU Load

cpu-usage nagios

I've been looking for some hours now for a plugin that will notify me if the CPU load on one of my servers has been over 90% for the past 5 hours.
No luck looking around the Nagios Exchange.

Can anyone help out?

Thanks!

Best Answer

CPU load under UNIX is typically defined as the number of processes in a runnable state (running or waiting for a CPU), so it is a count rather than a percentage. It is reported as averages over the last 1, 5, and 15 minutes. The uptime command is a common way to display these load averages.

~$ uptime
 18:35:22 up 1 min, 1 user, load average: 0.04, 0.01, 0.01
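
To make the three numbers concrete, here is a minimal Python sketch that pulls them out of a line of uptime output. The function name parse_load_averages is made up for illustration, and the regular expression assumes the usual "load average: a, b, c" layout with a dot as the decimal separator.

```python
import re


def parse_load_averages(uptime_line):
    """Return the (1, 5, 15) minute load averages found in a line of uptime output."""
    # Assumes the common "load average: 0.04, 0.01, 0.01" layout.
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load averages found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())


sample = "18:35:22 up 1 min, 1 user, load average: 0.04, 0.01, 0.01"
print(parse_load_averages(sample))  # (0.04, 0.01, 0.01)
```

On the local machine, os.getloadavg() returns the same three values without any parsing.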

check_load accepts a warning (-w) and a critical (-c) threshold, each given as a comma-separated triplet matching the 1, 5, and 15 minute averages.

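As a rough idea, try check_load -c 0.9,0.9,0.9 with a check_interval of 1 hour and a max_check_attempts of 5, so that Nagios only sends a notification after five consecutive failed checks. Set retry_interval to 1 hour as well, because Nagios uses it instead of check_interval for rechecks once the service is in a non-OK state.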
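
Put together, the object definitions might look roughly like the sketch below. The host name, command name, service description and generic-service template are placeholders. The warning triplet 0.8,0.8,0.8 is an assumption, added because the check_load usage line lists both -w and -c. The intervals assume the default interval_length of 60 seconds, so 60 means one hour, and $USER1$ is the plugin directory macro from a stock resource.cfg.

```
# Hypothetical names throughout; adjust to your own configuration.
define command {
    command_name    check_sustained_load
    command_line    $USER1$/check_load -r -w $ARG1$ -c $ARG2$
}

define service {
    use                     generic-service     ; assumed template
    host_name               myserver            ; placeholder host
    service_description     Sustained CPU load
    check_command           check_sustained_load!0.8,0.8,0.8!0.9,0.9,0.9
    check_interval          60                  ; one check per hour while OK
    retry_interval          60                  ; rechecks after a failure use this interval
    max_check_attempts      5                   ; hard state (and notification) on the 5th failure
}
```

With these values the fifth consecutive failing check comes about four hours after the first, so raise max_check_attempts if a strict five-hour window matters. As written the command runs on the Nagios host itself. For a remote server it would have to be wrapped in something like NRPE.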

Also note the -r (--percpu) argument, which divides the load averages by the number of CPUs. Most servers are multi-core, so a raw load of 0.9 can mean that one core is nearly saturated while the machine as a whole still has plenty of spare capacity.
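
The effect of per-CPU scaling can be sketched in a few lines of Python. This is a simplified model of the threshold comparison, not the plugin's actual code. The function names, the sample load values and the 0.8 warning triplet are all assumptions for illustration.

```python
def per_cpu_load(load_averages, cpu_count):
    """Divide each load average by the number of CPUs, as -r is described to do."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(load / cpu_count for load in load_averages)


def load_status(load_averages, warning, critical):
    """Simplified threshold check: any average above its limit raises the status."""
    if any(load > limit for load, limit in zip(load_averages, critical)):
        return "CRITICAL"
    if any(load > limit for load, limit in zip(load_averages, warning)):
        return "WARNING"
    return "OK"


raw = (2.8, 2.6, 2.4)        # hypothetical load on a 4-core machine
warning = (0.8, 0.8, 0.8)    # assumed warning triplet
critical = (0.9, 0.9, 0.9)

print(load_status(raw, warning, critical))                   # CRITICAL
print(load_status(per_cpu_load(raw, 4), warning, critical))  # OK
```

The same raw load that trips the critical threshold is well within limits once it is spread over four cores.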
