Are there good techniques for monitoring cron tasks over a cluster?
We're starting to use cron to launch tasks at daily intervals. A few ideas for collecting status information from them:
- Add special application handling that logs information into some "network aware" place, like a DB
- Build up a logfile system that transfers the cron log periodically to a central point for processing/querying (along with other possible log files)
I'm wondering whether people have had success handling cron separately from other things, or whether the tasks were folded into a different approach completely. I'm leaning towards the second option, but I'd like to know what more experienced folk would try.
Best Answer
In addition to the other answers:
We use the first option because it makes the check easy for Nagios (Icinga): if the last written timestamp is older than n hours (plus whatever logic you need), we know something went wrong.
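
To make the staleness check concrete, here is a minimal sketch of such a plugin in Python. It is not our actual check: the heartbeat file, the function names, and the 26-hour example threshold are all assumptions for illustration, and a file stands in for whatever "network aware" store (such as a DB) the job really writes to. The only fixed convention is the Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import sys
import time
from pathlib import Path

# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_heartbeat(last_run, now, max_age_hours):
    """Return (exit_code, message) for a job last seen at `last_run`.

    `last_run` and `now` are epoch seconds; `last_run` may be None
    when no heartbeat has ever been recorded.
    """
    if last_run is None:
        return UNKNOWN, "UNKNOWN - no heartbeat recorded"
    age_hours = (now - last_run) / 3600.0
    if age_hours > max_age_hours:
        return CRITICAL, f"CRITICAL - last run {age_hours:.1f}h ago (limit {max_age_hours}h)"
    return OK, f"OK - last run {age_hours:.1f}h ago"


def read_heartbeat(path):
    """Read an epoch timestamp from a file (hypothetical stand-in for a DB query).

    Returns None if the file is missing or does not contain a number.
    """
    try:
        return float(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


# Hypothetical invocation: check_cron_heartbeat.py /path/to/heartbeat 26
if __name__ == "__main__" and len(sys.argv) == 3:
    heartbeat_path, max_age = sys.argv[1], float(sys.argv[2])
    code, message = evaluate_heartbeat(read_heartbeat(heartbeat_path), time.time(), max_age)
    print(message)
    sys.exit(code)
```

The cron job itself would write the current epoch time to the same place as its last step, so a job that crashes or never starts leaves a stale timestamp. Setting the threshold a little above the schedule interval (for example 26 hours for a daily job) leaves some room for normal variation in run time.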