Process monitoring in a Linux environment

c linux-development multithreading

I'm trying to write a multi-threaded, multi-process application, and I need to know how to monitor one process from another at all times.
Here is what I have: two processes, each with multiple threads, that handle the network part, and another two processes, also multi-threaded, that interact with the DB and with the network processes. What I need is this: if, for example, one of the network processes goes down, the DB processes should start sending to the live network process until the second one is up again.
I'm using FIFOs between the DB and the network processes.

I was thinking of sending messages via message passing all the time, but I'm not sure whether this is a good idea, whether I should use some other IPC mechanism for this, or whether neither is good and I need something else entirely.

Best Answer

I would say that you should not depend on monitoring for routing the messages. I'd suggest having a single message queue that all network threads pick messages from, so that if one dies, all unprocessed messages eventually get picked up by the others (instead of the messages sent between the thread dying and the watchdog detecting it going into a black hole). The message that was being processed when the crash happened will still be lost, so you still need some resilience, but the disruption will be smaller.
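To make the shared-queue idea concrete, here is a minimal sketch in C. It assumes worker processes rather than threads, and it uses an `AF_UNIX` datagram `socketpair` as the queue instead of the FIFOs from the question, because datagrams keep message boundaries and each one is delivered to exactly one receiver. The names `worker`, `run_shared_queue_demo` and the `STOP` marker are made up for illustration, and passing the count back through the exit status is only a demo shortcut.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STOP (-1)   /* hypothetical "no more work" marker */

/* Worker: pull job ids from the shared queue until STOP arrives.
 * If crash_on_first is set, die on the first job to simulate a failure.
 * The exit status carries the number of jobs handled (demo only). */
static void worker(int queue_fd, int crash_on_first)
{
    int handled = 0;
    for (;;) {
        int job;
        ssize_t n = recv(queue_fd, &job, sizeof job, 0);
        if (n != (ssize_t)sizeof job || job == STOP)
            _exit(handled);
        if (crash_on_first)
            _exit(0);           /* simulated crash: this one job is lost */
        handled++;
    }
}

/* Feed njobs jobs to two workers that share one queue; worker 0 is faulty.
 * Returns how many jobs the healthy worker handled, or -1 on error. */
int run_shared_queue_demo(int njobs)
{
    int sv[2];                  /* sv[0]: producer end, sv[1]: shared consumer end */
    pid_t pids[2];
    int status, handled = -1, stop = STOP;

    assert(njobs >= 0 && njobs <= 100);   /* count must fit in an exit status */
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;

    for (int i = 0; i < 2; i++) {
        pids[i] = fork();
        if (pids[i] == -1)
            return -1;
        if (pids[i] == 0) {
            close(sv[0]);
            worker(sv[1], i == 0);        /* never returns */
        }
    }

    /* The parent keeps sv[1] open so the queue outlives any single worker. */
    for (int job = 0; job < njobs; job++)
        if (send(sv[0], &job, sizeof job, 0) == -1)
            return -1;
    for (int i = 0; i < 2; i++)           /* one STOP per worker */
        send(sv[0], &stop, sizeof stop, 0);

    waitpid(pids[0], NULL, 0);
    if (waitpid(pids[1], &status, 0) == pids[1] && WIFEXITED(status))
        handled = WEXITSTATUS(status);

    close(sv[0]);
    close(sv[1]);
    return handled;
}
```

Calling `run_shared_queue_demo(10)` should report that the healthy worker handled 9 or 10 jobs, depending on whether the faulty worker managed to grab one before dying. The sender never needs to know which worker is alive: whatever is still queued is simply picked up by the survivor.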

Then you'd just have a watchdog to make sure the required number of threads is always running. Each thread would send a message to the watchdog every n seconds, and if no message is received from some thread for 3 or 4 times n, the watchdog would kill it and start a new one.
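The bookkeeping side of such a watchdog could look roughly like the sketch below. It only covers the "who has gone quiet" decision. How heartbeats arrive (pipe, socket, message queue) and how a worker is killed and restarted are left out, and all names (`struct watchdog`, `watchdog_heartbeat`, `watchdog_find_stale`, `MISSED_LIMIT`) are hypothetical. Timestamps are passed in as plain seconds so the logic is easy to test; in a real program they would most likely come from `clock_gettime(CLOCK_MONOTONIC, ...)`.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORKERS  8
#define MISSED_LIMIT 3      /* declare a worker dead after 3 * n seconds of silence */

struct watchdog {
    long   interval;                 /* n: expected seconds between heartbeats */
    long   last_seen[MAX_WORKERS];   /* time of the last heartbeat per worker */
    size_t nworkers;
};

/* Start tracking nworkers workers; all count as "seen" at time now. */
void watchdog_init(struct watchdog *wd, size_t nworkers, long interval, long now)
{
    assert(nworkers <= MAX_WORKERS && interval > 0);
    wd->interval = interval;
    wd->nworkers = nworkers;
    for (size_t i = 0; i < nworkers; i++)
        wd->last_seen[i] = now;
}

/* Record a heartbeat from worker id (also call this after restarting it). */
void watchdog_heartbeat(struct watchdog *wd, size_t id, long now)
{
    assert(id < wd->nworkers);
    wd->last_seen[id] = now;
}

/* Write the ids of workers silent for more than MISSED_LIMIT * interval
 * into stale[] (at most cap entries) and return how many were found.
 * The caller would then kill and restart each of them. */
size_t watchdog_find_stale(const struct watchdog *wd, long now,
                           size_t *stale, size_t cap)
{
    size_t count = 0;
    for (size_t i = 0; i < wd->nworkers && count < cap; i++)
        if (now - wd->last_seen[i] > MISSED_LIMIT * wd->interval)
            stale[count++] = i;
    return count;
}
```

The watchdog loop would call `watchdog_heartbeat` whenever a message arrives and `watchdog_find_stale` once per interval. For each id returned, it would send `SIGKILL` to the process (or cancel the thread), start a replacement, and record a fresh heartbeat for that slot.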