Cron – Techniques to Monitor cron tasks

cron, monitoring

Are there good techniques for monitoring cron tasks over a cluster?

We're starting to use cron to launch tasks at daily intervals. A few ideas for keeping track of what those tasks do:

1. Add special application handling that logs information into some "network aware" place, like a DB
2. Build up a logfile system that transfers the cron log periodically to a central point for processing/querying (along with other possible log files)

I'm wondering whether people have had success handling cron separately from other things, or whether they folded the tasks into a different approach completely. I'm leaning towards #2, but I'd like to know what more experienced folk might try.

Best Answer

In addition to the other answers:

let the job write a timestamp to a file when it finishes, along with the exit status of the actual job
propagate that exit status back to the original caller

We use the first to make it easier for Nagios (Icinga) to check: if the last written timestamp is older than n hours (plus whatever logic you need), we know something went wrong.
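The two points above could be sketched roughly as follows. This is only an illustration, not the answerer's actual setup: the function names run_and_stamp and check_stamp, the status-file format (epoch seconds, a space, then the exit status) and the thresholds are all assumptions, and it relies on `date +%s`, which GNU and BSD date provide but POSIX does not guarantee. The check uses the usual Nagios plugin exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Hypothetical wrapper: run a job, then record when it finished and how it exited.
# Usage: run_and_stamp STATUS_FILE COMMAND [ARGS...]
run_and_stamp() {
    status_file=$1
    shift
    rc=0
    "$@" || rc=$?
    # Write to a temp file and rename it, so a reader never sees a partial line.
    printf '%s %s\n' "$(date +%s)" "$rc" > "$status_file.tmp" &&
        mv "$status_file.tmp" "$status_file"
    # Propagate the job's exit status back to the caller (cron, in practice).
    return "$rc"
}

# Hypothetical Nagios-style check for a status file written by run_and_stamp.
# Usage: check_stamp STATUS_FILE MAX_AGE_SECONDS
check_stamp() {
    status_file=$1
    max_age=$2
    if [ ! -r "$status_file" ]; then
        echo "UNKNOWN: $status_file is not readable"
        return 3
    fi
    finished=
    rc=
    read -r finished rc < "$status_file" || true
    # Both fields must be plain non-negative integers.
    for field in "$finished" "$rc"; do
        case $field in
            ''|*[!0-9]*)
                echo "UNKNOWN: $status_file is malformed"
                return 3
                ;;
        esac
    done
    age=$(( $(date +%s) - finished ))
    if [ "$rc" -ne 0 ]; then
        echo "CRITICAL: last run exited with status $rc"
        return 2
    fi
    if [ "$age" -gt "$max_age" ]; then
        echo "CRITICAL: last run finished ${age}s ago (limit ${max_age}s)"
        return 2
    fi
    echo "OK: last run finished ${age}s ago with status 0"
    return 0
}
```

In a crontab, the job line would then call a small script that invokes run_and_stamp with a status file of your choosing, and the monitoring side would call check_stamp with, say, 90000 seconds (25 hours) as the limit for a daily job.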

Related Solutions

What tool do you use to monitor your servers

I've used Nagios in the past with success. It's very extensible (over 200 add-ons), relatively easy to use, and has lots of reports. The one negative would be the initial setup.

Cron – How to Prevent Duplicate Cron Jobs Running

There are a couple of programs that automate this feature and take away the annoyance and potential bugs of doing it yourself. They also avoid the stale lock problem (which is a risk if you're just using touch) by using flock behind the scenes. I've used lockrun and lckdo in the past, but now there's flock(1) (in newish versions of util-linux), which is great. It's really easy to use:

* * * * * /usr/bin/flock -n /tmp/fcj.lockfile /usr/local/bin/frequent_cron_job
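The same locking can also be done from inside a script by taking the lock on an open file descriptor, which is one of the usage forms described in the flock(1) man page. The sketch below assumes flock from util-linux is installed; the helper name run_exclusive and the exit status 75 for "already running" are arbitrary choices made for this illustration (75 is borrowed from the sysexits EX_TEMPFAIL convention).

```shell
#!/bin/sh
# Hypothetical helper: run COMMAND only if nobody else holds a lock on LOCKFILE.
# Usage: run_exclusive LOCKFILE COMMAND [ARGS...]
run_exclusive() {
    lockfile=$1
    shift
    (
        # Keep the lock file open on descriptor 9 for the life of this subshell.
        exec 9>"$lockfile" || exit 1
        # -n: fail immediately instead of waiting if the lock is already held.
        flock -n 9 || { echo "already running, skipping" >&2; exit 75; }
        # The kernel releases the lock once every process holding descriptor 9
        # has exited, so a crashed job cannot leave a stale lock behind.
        "$@"
    )
}
```

Called as run_exclusive /tmp/fcj.lockfile /usr/local/bin/frequent_cron_job, this behaves much like the crontab line above, with the job's own exit status passed through when the lock was obtained.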
