I would like to know if you have any experience with, or ideas about, setting up Nagios at large scale.
Previously we used Nagios with NagiosQL for manual configuration, which was comfortable enough for a few servers.
Recently the number of servers has grown, and manual configuration through NagiosQL has become unwieldy. We use Chef to start new instances, and I would like to know whether there are good practices for using Chef and Nagios together. One option is to use plain Nagios and rewrite its configuration files (based on server role) every time we start a new instance.
For example, the scenario could look like this: when a new MySQL server is started, a dedicated recipe rewrites the Nagios configuration file. The recipe can pull data about every server from Chef data bags and build the settings based on roles in Chef.
Best Answer
I've implemented three slightly different solutions for Nagios monitoring using Chef over the last 18 months. They're all based around Chef's template resource for generating configuration files using the ERB syntax and that bit has worked really well. You have a Ruby array or hash of hosts and services, and Nagios configuration files are generated. It's pretty easy to test and debug.
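As a rough illustration of the template approach, here is a minimal sketch that uses Ruby's standard ERB library outside of Chef. The host names, addresses, and the generic-host template name are made up for the example and are not taken from any real setup.

```ruby
require 'erb'

# Hypothetical host list; in a cookbook this would come from Chef rather than being hard-coded.
hosts = [
  { 'name' => 'db1',  'address' => '10.0.0.11' },
  { 'name' => 'web1', 'address' => '10.0.0.21' }
]

# One Nagios host definition per entry; 'generic-host' is an assumed template name.
template = ERB.new(<<~TEMPLATE, trim_mode: '-')
  <% hosts.each do |host| -%>
  define host {
    use        generic-host
    host_name  <%= host['name'] %>
    address    <%= host['address'] %>
  }
  <% end -%>
TEMPLATE

# Render the template with the host list bound to the local name `hosts`.
config = template.result_with_hash(hosts: hosts)
puts config
```

In a cookbook the same ERB would live in a template file rendered by Chef's template resource, with the list passed in through the resource's variables and read in the template as an instance variable.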
One of these setups uses a nagios_hosts and a nagios_services data bag, and each host has a key that says which service checks get run, e.g. check_load, check_disk. This setup is quick to get going and works reasonably well, although if hosts are deleted or new ones are added, someone has to be around to update the data bags. In practice it's easy to forget about this, and things can get out of date, which can lead to trouble.
So after going through all of that I would probably recommend using something like option #2 for a small number (tens to hundreds) of nodes. I would try to keep it simple, though. I used Chef's attribute system to define and override thresholds for the service checks based on roles, and while it works, it's way too complicated and the cookbook has ended up becoming an unmaintainable mess.
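To make the data bag layout mentioned in this answer more concrete, here is a small sketch in plain Ruby. The item contents, the 'checks' key name, and the descriptions are assumptions for illustration; real data bag items would be loaded through Chef rather than written inline.

```ruby
# Hypothetical items shaped like a nagios_hosts data bag; 'checks' is an assumed key name.
nagios_hosts = [
  { 'id' => 'db1',  'address' => '10.0.0.11', 'checks' => %w[check_load check_disk] },
  { 'id' => 'web1', 'address' => '10.0.0.21', 'checks' => %w[check_load] }
]

# Hypothetical nagios_services data: check name => description.
nagios_services = {
  'check_load' => 'Load average',
  'check_disk' => 'Disk usage'
}

# Build one service entry per (host, check) pair, skipping checks that are not defined.
services = nagios_hosts.flat_map do |host|
  host['checks'].select { |check| nagios_services.key?(check) }.map do |check|
    {
      'host_name'     => host['id'],
      'check_command' => check,
      'description'   => nagios_services[check]
    }
  end
end

services.each do |s|
  puts "#{s['host_name']}: #{s['check_command']} (#{s['description']})"
end
```

The resulting list is the kind of array a template can loop over to write service definitions. It also shows the weak point described above: a host that is missing from the data bag simply produces no checks, and nothing flags the omission.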
Good luck!