Redhat – Can we list all possible reasons for a cron job crashing or being killed

aix cron redhat

We have some cron jobs that must not die; otherwise the consequences would be devastating. So we need to keep an eye on the processes and know exactly what caused the death of a job. I would therefore like to ask for any possible reasons a job might die, what the symptoms are, what values are returned, and where the system error log is stored.

We are using two servers. One runs "Red Hat Enterprise Linux AS release 3 (Taroon Update 2)", the other runs AIX 7.1. I wonder whether an OOM killer is enabled by default.

We only have an ordinary user account, so we cannot view system logs such as those under /var/log.

The job could be a shell script that runs a Java program.

Best Answer

I wonder whether an OOM killer is enabled by default.

I don't know about the AIX implementation, but Linux does have an out-of-memory (OOM) killer that is triggered in low-memory conditions. Additionally, your jobs might run into resource limits set via ulimit or similar facilities.
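Since you cannot read the system logs, one thing you can do from a user account is record the limits your job actually runs under. The following is only a minimal sketch: the function name `log_limits` and the log path in the usage note are made up for illustration, and the exact output of `ulimit -a` differs between shells and between Linux and AIX.

```shell
#!/bin/sh
# Hypothetical helper: append the resource limits of the current shell
# to a log file you own, so they can be checked after a failure.
log_limits() {
    log_file=$1
    {
        echo "=== limits at $(date) for PID $$ ==="
        ulimit -a
    } >> "$log_file" 2>&1
}
```

Calling something like `log_limits "$HOME/cronjob-limits.log"` at the top of the cron script shows the limits cron gave the job, which may differ from those of your login shell.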

We have some cron jobs that must not die; otherwise the consequences would be devastating.

This is broken by design. Errors occur, failures happen - you have to be able to deal with that.

what the symptoms are, what values are returned

This depends entirely on the process you are running. It may or may not choose to return meaningful values to the operating system. As it happens, things are tricky with Java exceptions - the exit code seen by the shell might be zero even after a stack trace has been printed, so if you can't change the Java code, you should parse the output to catch the errors.
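A rough sketch of such a check, assuming bash: run the command, keep its output, look at the exit status, and only then scan the output for a stack trace. The function name `run_and_classify` and the grep patterns are assumptions for illustration, not anything your job defines. In bash, a status above 128 usually means the process was terminated by signal number status minus 128 (137 is SIGKILL, which is what the OOM killer sends), although a program can also choose to exit with such a value itself.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command, save its output, classify the result.
# Usage: run_and_classify OUTPUT_FILE COMMAND [ARGS...]
run_and_classify() {
    local out_file=$1
    shift
    "$@" > "$out_file" 2>&1
    local status=$?

    if [ "$status" -gt 128 ]; then
        # Most likely terminated by a signal; kill -l maps the status to a name.
        local sig_name
        sig_name=$(kill -l "$status" 2>/dev/null || echo unknown)
        echo "killed by signal $((status - 128)) ($sig_name)"
        return "$status"
    fi

    if [ "$status" -ne 0 ]; then
        echo "exited with status $status"
        return "$status"
    fi

    # Status 0 is not proof of success for Java: look for stack-trace text.
    # These patterns are a guess; adjust them to what your program prints.
    if grep -Eq 'Exception|^[[:space:]]+at .+\(.*\)' "$out_file"; then
        echo "exit status 0 but output contains a Java stack trace"
        return 1
    fi

    echo "ok"
    return 0
}
```

In the cron script this could be called as `run_and_classify "$HOME/job.out" java -jar myjob.jar` (file names are placeholders), with the printed message and return value appended to a log or mailed to you.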