Sensu alternative (?) where alarm thresholds are defined on the server (not the monitored client)

monitoring rabbitmq

Question/TLDR;

Is there a Sensu alternative (i.e. an operating system monitoring agent/server based on RabbitMQ) that defines its alarm thresholds on the central monitoring server and not on the monitored client server (as Sensu and Nagios do)?

RabbitMQ is required so no Zabbix et al, I'm afraid.

Background:

I have large environments (Windows and RHEL) where I can't install orchestration tools (Puppet et al.), so the number of installed services should be kept to a minimum.

I'm researching if I could develop a single agent that collects system information, relays logs (to Logstash) and reports on resource consumption.
It would push all these values into RabbitMQ. Logstash could then subscribe to the logs, a monitoring service could subscribe to the resource metrics (and create alarms from them), a CMDB system could subscribe to the system information, etc.
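To make the idea concrete, here is a rough sketch of what such an agent might put on the wire. The message kinds (`logs`, `metrics`, `sysinfo`), the routing key layout and the JSON field names are all made up for illustration; the actual publish call (with whatever AMQP client library is available) is left out.

```python
import json
import shutil
import socket
import time


def build_message(kind, payload, host=None, now=None):
    """Return (routing_key, body) for one agent report.

    kind is a hypothetical category name such as "logs", "metrics" or "sysinfo".
    """
    host = host or socket.gethostname()
    # Dots separate words in AMQP topic routing keys, so replace them in the hostname.
    routing_key = "{}.{}".format(kind, host.replace(".", "_"))
    body = json.dumps(
        {
            "host": host,
            "time": now if now is not None else time.time(),
            "data": payload,
        },
        sort_keys=True,
    )
    return routing_key, body


def disk_metrics(path="/"):
    """Collect raw disk numbers only; no threshold is evaluated on the client."""
    usage = shutil.disk_usage(path)
    return {"path": path, "total_bytes": usage.total, "free_bytes": usage.free}
```

With a topic exchange, Logstash could bind a queue to `logs.*`, the monitoring service to `metrics.*` and the CMDB to `sysinfo.*`, so each consumer only sees its own kind of message.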

However, I would want to just receive the information about resource consumption and create the alarms on the monitoring server and not have to change the thresholds on each server to change the alarm threshold.

I can't be the only person to find an agent like that useful…

Clarification:

If a server under Sensu monitoring is running out of disk, the Sensu agent checks the disk space, compares it against the CRITICAL threshold that's defined on that server and if the threshold is passed, a CRITICAL alarm is sent through RabbitMQ to the central monitoring server.
To change the threshold without Puppet or somesuch, logging in to the server is required (right?).

The way I'd like this to work is that when a monitoring agent checks its disk space, it just sends the amount of available disk (or used disk and total etc) through RabbitMQ to the central server which then compares that value against the threshold defined on the central server and, if necessary, sends an alarm.

If the threshold needs to be changed, it's changed on the central server. Values from multiple servers could also be compared to create an alarm.
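Roughly, the server-side logic I have in mind looks like the sketch below. The threshold table, the host names and the report fields (`host`, `free_bytes`, `total_bytes`) are hypothetical; the point is only that the agent sends raw numbers and every comparison happens centrally.

```python
# Hypothetical central configuration: thresholds live only on the monitoring server.
THRESHOLDS = {
    "default": {"warning": 20.0, "critical": 10.0},  # percent of disk still free
    "db01": {"warning": 30.0, "critical": 15.0},     # per-host override
}


def free_percent(report):
    """Percentage of free disk computed from the raw values the agent sent."""
    return 100.0 * report["free_bytes"] / report["total_bytes"]


def evaluate_disk(report, thresholds=THRESHOLDS):
    """Classify one agent report as OK, WARNING or CRITICAL."""
    limits = thresholds.get(report["host"], thresholds["default"])
    pct = free_percent(report)
    if pct <= limits["critical"]:
        return "CRITICAL"
    if pct <= limits["warning"]:
        return "WARNING"
    return "OK"


def all_low_on_disk(reports, min_free_pct=25.0):
    """Compare values from several servers: alarm only if every one of them is low."""
    return bool(reports) and all(free_percent(r) < min_free_pct for r in reports)
```

Changing a limit is then a one-line edit to `THRESHOLDS` on the central server, with nothing to touch on the monitored hosts.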

This is kinda my main issue with Sensu, although I understand the decision to go with Nagios compatibility.

It would also be preferable if no central server -> monitored server traffic were required. I suppose a kludge could be made where the central server sends the thresholds to the agent, which then runs them as "local" checks. The network for the environment makes this exceptionally tricky.

Thanks for any ideas anyone might have!

Best Answer

Using open source components, I'd go with the following (if you do indeed need to send metrics via RabbitMQ):

  1. use collectd on the client side to send metrics into RabbitMQ with its AMQP plugin
  2. consume the messages from RabbitMQ using graphite-amqp-tools and send them into Graphite
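To give an idea of what travels through the queue in this setup: assuming the collectd AMQP plugin is configured to publish in Graphite format, each message body should be one or more lines of Graphite's plaintext protocol (`metric.path value timestamp`). The sketch below is not code from graphite-amqp-tools, just an illustration of parsing such a body in a consumer of your own; the metric name in the usage note is made up.

```python
def parse_graphite_line(line):
    """Parse 'metric.path value timestamp' into a (path, value, timestamp) tuple."""
    path, value, timestamp = line.split()
    return path, float(value), int(float(timestamp))


def parse_payload(payload):
    """Parse a message body holding newline-separated Graphite plaintext lines."""
    if isinstance(payload, bytes):
        payload = payload.decode("utf-8")
    return [parse_graphite_line(line) for line in payload.splitlines() if line.strip()]
```

For example, `parse_payload(b"collectd.web01.df-root.free 1234.5 1400000000\n")` yields a single tuple with the path, the value as a float and the timestamp as an integer, which a consumer could forward to Graphite or evaluate directly.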

Now that you have the metrics in Graphite, you can query it for your resource consumption. In my $WORK's environment, we have checks which query Graphite, with the alerting thresholds set on the Nagios server. But since Graphite has an HTTP interface for querying (it can return graphs, JSON, CSV and plain text results), you could build or use anything you like, as long as it can query Graphite.
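A minimal sketch of such a check, assuming Graphite's render API with `format=json` (which returns a list of series, each with a `datapoints` list of `[value, timestamp]` pairs). The base URL, metric name and threshold values are placeholders, and this is not the actual check we run, just the general shape of one.

```python
import json
import urllib.parse
import urllib.request

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios plugin exit codes


def render_url(base_url, target, window="-5min"):
    """Build a Graphite render API URL that returns JSON for one target."""
    query = urllib.parse.urlencode({"target": target, "from": window, "format": "json"})
    return "{}/render?{}".format(base_url.rstrip("/"), query)


def fetch(url, timeout=10):
    """Fetch and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def latest_value(series):
    """Return the newest non-null value; Graphite often pads recent points with null."""
    for value, _timestamp in reversed(series["datapoints"]):
        if value is not None:
            return value
    return None


def check(render_json, warning, critical):
    """Compare the latest value of the first series against server-side thresholds.

    Assumes higher is worse, e.g. percent of disk used.
    """
    if not render_json:
        return UNKNOWN
    value = latest_value(render_json[0])
    if value is None:
        return UNKNOWN
    if value >= critical:
        return CRITICAL
    if value >= warning:
        return WARNING
    return OK
```

A Nagios command would then call something like `sys.exit(check(fetch(render_url(base, target)), warning=80, critical=90))`, so the thresholds live in the Nagios configuration and never on the monitored host.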