Yes, so I am getting to grips with (and loving) Zabbix, and have started the process of fine-tuning the alerts.
I have this alert that is triggered on a Linux server for having over 300 processes.
Now, this is sort of a central server that acts as a firewall and runs a bunch of stuff, namely proxy/httpd-server/mysql/open-vpn/zabbix.
Is there anything to look out for before I bump the alert trigger up to 350 processes?
The CPU load is still relatively low, so I was thinking one should check other things before raising the alert threshold.
Would I need to check whether the machine is bottlenecked elsewhere, i.e. I/O bound?
Any good advice for this or good documentation (hopefully well-written and easy to understand) as always would be greatly appreciated.
Best Answer
Like @sam said, it all depends on what the server is doing and how beefy the server hardware is. Running only a handful of extremely CPU-, memory- and/or I/O-intensive processes can easily overload even a powerful server. Especially if something makes your server swap, everything will move along slower than a snail or a turtle.
On the other hand, something like a Postfix server can easily have a process count in the hundreds, or thousands, as everything Postfix does is very lightweight.
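To put a number on the swapping concern mentioned above, here is a minimal sketch that parses the content of `/proc/meminfo` on Linux. The function name `swap_usage_kb` is made up for illustration, and it assumes the usual `SwapTotal` / `SwapFree` lines are present.

```python
def swap_usage_kb(meminfo_text):
    """Return (swap_total_kb, swap_used_kb) parsed from /proc/meminfo content."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "Name:   12345 kB"
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return total, total - free
```

On a real machine you would call it as `swap_usage_kb(open("/proc/meminfo").read())`. Keep in mind that some used swap is not a problem in itself; sustained swap-in/swap-out activity (the `si`/`so` columns of `vmstat`) is usually the better sign that the box is actually thrashing.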
In my opinion, monitoring (or at least alerting on) the global process count is not useful. If, however, you know for sure that there should never be more than X instances of some process, then monitor that and raise an alert if there are suddenly more than X of them.
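As a rough illustration of the per-process idea, the sketch below counts running processes by name by reading `/proc/<pid>/comm` on Linux and reports any name that exceeds its limit. The function names and the limits dictionary are assumptions made up for this example, not anything Zabbix provides. In Zabbix itself, the agent item key `proc.num[<name>]` returns the same kind of count.

```python
import os
from collections import Counter


def count_processes_by_name(proc_root="/proc"):
    """Return a Counter mapping process name -> number of running instances."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are PIDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                counts[f.read().strip()] += 1
        except OSError:
            continue  # process exited between listing and reading
    return counts


def over_limit(counts, limits):
    """Return {name: (count, limit)} for every name exceeding its limit."""
    return {
        name: (counts.get(name, 0), limit)
        for name, limit in limits.items()
        if counts.get(name, 0) > limit
    }
```

For example, `over_limit(count_processes_by_name(), {"httpd": 50, "mysqld": 2})` returns an empty dict while everything is within bounds, so an alert only needs to fire when the result is non-empty. The limits here are placeholders; pick values that match what your services are actually configured to spawn.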
You can also graph the number of certain processes to see trends: for example, I tend to graph the Cyrus IMAP/POP process count so I can see whether it hovers anywhere near the current hard limits.
If you have some predictable process behaviours, you can use something like psmon to automatically restart or kill misbehaving processes (with optional logging or e-mailing about the events psmon handled). Sure, Zabbix can be used for this too, but psmon is very easy to configure for this kind of task.
What I would graph and monitor
In general, graph (and monitor) at least the following:
Then monitor at least the following:
I hope this helps you. :)