Redhat – process works under strace but not normally

Tags: perl, redhat, strace

I have a process, a Perl script, that does:

while true
    check a POP account on a server on the lan
    process any email found
    write logs - messages found, actions taken, errors
    sleep for 15 seconds

It's running on a Red Hat 7.3 server (I inherited it, and I'm not happy about the age of that box). It's started from /etc/inittab like this:

spop:2345:respawn:/usr/local/gw/bin/popdmn 

If it dies, init restarts it.

For the last couple of days, the process has not worked unless it's being straced. When it's just running on its own, it never logs into the POP server. As soon as it's straced (via "strace -Ff -p `cat /usr/local/gw/var/popdmn.pid`"), it works flawlessly.

As a workaround, I'm running screen on the server with an strace running. Obviously this is less than ideal.

Why would a process do this? I haven't seen this happen before.

Best Answer

I think I've been bitten by an ancient strace bug:

https://bugzilla.redhat.com/show_bug.cgi?id=64303

https://bugzilla.redhat.com/show_bug.cgi?id=75709

This box has strace-4.4-4 on it, so it could well be that bug. It looks like this one is self-inflicted: we were stracing the process while trying to debug something else, and that made it worse.

kill -CONT works to resume the process.
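
As a rough sketch of that fix, assuming the process was left in the stopped state (which the fact that kill -CONT helps suggests), a small shell function can check the state before sending the signal. The function name resume_if_stopped is made up for illustration, and it assumes the ps on the box accepts "-o stat=" and "-p", which may not hold for every old procps:

```shell
# Hypothetical helper (not from the original post): resume a process
# that has been left in the stopped state.
resume_if_stopped() {
    pid=$1
    # "ps -o stat=" prints only the state column, with no header;
    # empty output means there is no such process.
    state=$(ps -o stat= -p "$pid" 2>/dev/null | tr -d ' ')
    case "$state" in
        "") echo "pid $pid: not running"
            return 1 ;;
        # A state beginning with "T" means the process is stopped.
        T*) echo "pid $pid: stopped ($state), sending SIGCONT"
            kill -CONT "$pid" ;;
        *)  echo "pid $pid: state $state, nothing to do" ;;
    esac
}
```

With the pid file from the question, it would be called as: resume_if_stopped `cat /usr/local/gw/var/popdmn.pid`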

Definitely time to upgrade this box.
