As @sam said, it all depends on what the server is doing and how beefy the server hardware is. Running only a handful of extremely CPU-, memory- and/or I/O-intensive processes can easily overload even a powerful server. And if something makes your server swap, everything will move slower than a snail.
On the other hand, something like a Postfix server can easily have a process count in the hundreds or thousands, as everything Postfix does is very lightweight.
In my opinion, monitoring (or at least alerting on) the global process count is not useful. However, if you know for sure that there should never be more than X instances of some process, monitor that and raise an alert when there are suddenly more than X of them.
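The per-process limit idea can be sketched in a few lines of Python. This is only an illustration, not how psmon or Zabbix does it: it assumes a Linux `/proc` filesystem, and the function names, process names and limits are made up for the example.

```python
import os
from collections import Counter


def count_processes(proc_root="/proc"):
    """Count running processes by command name (Linux only, reads /proc/<pid>/comm)."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only the numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                counts[fh.read().strip()] += 1
        except OSError:
            continue  # the process exited between listdir() and open()
    return counts


def over_limit(counts, limits):
    """Return {name: (count, limit)} for every process name that exceeds its limit."""
    return {
        name: (counts.get(name, 0), limit)
        for name, limit in limits.items()
        if counts.get(name, 0) > limit
    }


if __name__ == "__main__":
    # Hypothetical limits; use values that match your own services.
    limits = {"imapd": 200, "pop3d": 100}
    for name, (count, limit) in over_limit(count_processes(), limits).items():
        print(f"ALERT: {count} {name} processes running (limit {limit})")
```

Keeping the comparison in `over_limit` separate from the `/proc` scan means the alerting rule can be checked against plain dictionaries without touching a live system.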
You can also graph the number of certain processes to spot trends: for example, I tend to graph the Cyrus IMAP/POP process count so I can see whether it hovers anywhere near the current hard limits.
If you have processes with predictable behaviour, you can use something like psmon to automatically restart or kill misbehaving ones (with optional logging or e-mail notification of the events psmon handled). Sure, Zabbix can be used for this too, but psmon is very easy to configure for this kind of task.
What I would graph and monitor
In general, graph (and monitor) at least the following:
- load average
- memory usage
- disk usage
- CPU usage
- amount of network traffic
- number of certain individual processes, if you need to
- response times for your services
- server uptime (can be a very useful graph; if some server starts to misbehave and needs to be rebooted often, it's easy to spot from the graphs the moment problems started)
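A monitoring system such as Zabbix normally collects these values for you. Purely to illustrate what one sample of a few of them looks like, here is a rough Python sketch; it assumes a Unix-like host (and Linux for the uptime value), and the function and key names are invented for the example.

```python
import os
import shutil
import time


def read_uptime_seconds(path="/proc/uptime"):
    """Return uptime in seconds from the first field of /proc/uptime (Linux only)."""
    with open(path) as fh:
        return float(fh.read().split()[0])


def sample_metrics(mount_point="/"):
    """Collect one sample of load average, disk usage and uptime."""
    load1, load5, load15 = os.getloadavg()  # Unix only
    disk = shutil.disk_usage(mount_point)
    try:
        uptime = read_uptime_seconds()
    except OSError:
        uptime = None  # no /proc/uptime on this platform
    return {
        "timestamp": time.time(),
        "load1": load1,
        "load5": load5,
        "load15": load15,
        "disk_used_pct": 100.0 * disk.used / disk.total,
        "uptime_s": uptime,
    }


if __name__ == "__main__":
    print(sample_metrics())
```

Storing a sample like this at a fixed interval is what makes graphs possible; a sudden drop in `uptime_s` between two samples, for instance, marks a reboot.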
Then monitor at least the following:
- whether the processes that should be up are responding correctly. In my opinion, just testing whether the port is open or the process is present is not enough. Instead, if you want to check that a web server is running, check that it returns HTTP 200 OK and, preferably, that the test page contains some expected strings.
- server ping. If ping fails, alert immediately.
- kernel logs for severe things such as I/O errors, failed paths in a SAN multipath configuration, kernel panics, OOM events, and so on
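The web server check described above (HTTP 200 OK plus expected strings in the page) could look roughly like this in Python. It is a minimal sketch using only the standard library; the function names, URL and expected strings are assumptions made for the example, not part of any monitoring tool.

```python
import urllib.request


def body_looks_healthy(status, body, expected_strings):
    """True only for HTTP 200 with every expected string present in the body."""
    return status == 200 and all(s in body for s in expected_strings)


def check_web_service(url, expected_strings, timeout=5.0):
    """Fetch url and apply the health rule; any network failure counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return body_looks_healthy(resp.status, body, expected_strings)
    except OSError:
        # URLError, HTTPError (4xx/5xx) and socket timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    # Hypothetical test page and marker string.
    ok = check_web_service("http://localhost/status.html", ["service OK"])
    print("healthy" if ok else "ALERT: web service check failed")
```

A check like this also catches the case where the port is open and the process exists but the application behind it only returns error pages.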
I hope this helps you. :)
Best Answer
You shouldn't allow root logins at all, because that's insecure. You should only allow regular user logins; once logged in, the user can use `sudo` to execute commands as root.
That being said, why would you only check root logins? Checking regular user logins is just as important. Bots on the internet perform brute-force attacks against regular user accounts all the time.
Either way, you need to check `/var/log/auth.log` for successful SSH logins. Checking log files requires active Zabbix agent checks. So first, you need to make sure that active checks are working properly (see this blog post).
Secondly, the Zabbix agent runs (by default) as the `zabbix` user. So it will not be able to read `/var/log/auth.log`, because that file is only readable by root and users in the `adm` group. So you can add the `zabbix` user to the `adm` group. This allows Zabbix to read many log files (source).
Finally, you need to create a monitoring item and a trigger in the Zabbix frontend.
Create an item:
Create a trigger for that item:
Notice the `Accepted .*` regular expression in the item. This should match all types of SSH authentication, be it password authentication or public key authentication. Of course, in your case, you can change the regex to only match logins from root. But as explained earlier, this makes no sense from a security perspective.
Also notice that I used `log[...]`, because this causes the log line matched by the regex to be included in the notification email that Zabbix will send. That way, you can see in the email which user was authenticated.
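To see what the regular expression does and does not match, here is a small Python sketch. The sample lines are invented for illustration in the usual OpenSSH log format, and the root-only pattern is just one possible way to narrow the match; neither is taken from the Zabbix item itself.

```python
import re

# The regex from the item: matches any successful SSH login line.
ACCEPTED = re.compile(r"Accepted .*")
# One possible narrower variant (an assumption): successful root logins only.
ACCEPTED_ROOT = re.compile(r"Accepted \S+ for root ")

# Made-up sample lines in the style of /var/log/auth.log.
SAMPLE_LINES = [
    "Jan 10 12:00:01 host sshd[1001]: Accepted password for alice from 192.0.2.10 port 50022 ssh2",
    "Jan 10 12:00:05 host sshd[1002]: Accepted publickey for root from 192.0.2.11 port 50023 ssh2",
    "Jan 10 12:00:09 host sshd[1003]: Failed password for invalid user admin from 203.0.113.5 port 50024 ssh2",
]


def matching_lines(pattern, lines):
    """Return the lines in which the pattern is found anywhere."""
    return [line for line in lines if pattern.search(line)]


if __name__ == "__main__":
    for line in matching_lines(ACCEPTED, SAMPLE_LINES):
        print(line)
```

With these samples, `ACCEPTED` picks up both the password and the public key login and skips the failed attempt, while `ACCEPTED_ROOT` keeps only the root login.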