Electronic – Rx data rate on CAN bus faster than polling rate

Tags: can, embedded, interrupts, lpc, microcontroller

I'm working on a safety-critical SIL 4 system and so interrupts have to be kept to a bare minimum (hence using only timer interrupts). CAN is used in polling mode.

Suppose data arrives at a CAN node and is not read from the buffer before another node sends its data. Will the data be overwritten or corrupted, or will the first node's data persist and the second node's data be lost?

Is it OK to use more than one interrupt in a safety-critical SIL 4 system? As far as I can see, the only ways to service the data in time without losing any are to use the Rx interrupt for CAN or to give each node a time slot to communicate on the bus (not a good approach, since CAN is a multi-master protocol).

Note: I'm using an LPC1778 microcontroller.

Best Answer

You should avoid interrupts when possible. Issues:

  1. They mess up predictable real-time behavior.
  2. Numerous interrupts may lead to unpredicted stack usage.
  3. It is easy to write very subtle and very severe bugs when sharing data between an ISR and the background program.

That being said, you can avoid these problems with careful system design.

1) is only a problem for non-cyclic interrupts that may arrive at any point in time. As long as the interrupts have deterministic behavior and are triggered cyclically, you can use them. In that case they are no different than a high priority process and you can still predict real-time behavior of the system.

2) can be avoided by reducing the number of interrupt sources as far as possible. Another safety measure is to always allocate a stack that is larger than necessary and, most importantly, to place the stack so that on overflow it doesn't fail-cascade into other RAM segments such as .bss or .data! Here is a good article about this.
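One common way to check that the stack really is larger than necessary is "stack painting": fill the stack region with a known pattern at start-up and later count how much of the pattern survived. The following is only a minimal sketch under stated assumptions: the stack is assumed to grow downwards towards the lowest address, a plain array stands in for the real stack region, and the names stack_paint, stack_unused_words and STACK_FILL_PATTERN are hypothetical. On a real target the painting is usually done in start-up code, before the stack is in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Arbitrary fill value; any pattern unlikely to occur by chance will do. */
#define STACK_FILL_PATTERN 0xA5A5A5A5u

/* Fill the whole stack region with the pattern (run once at start-up). */
void stack_paint(uint32_t *stack_bottom, size_t words)
{
    assert(stack_bottom != NULL);
    for (size_t i = 0; i < words; i++) {
        stack_bottom[i] = STACK_FILL_PATTERN;
    }
}

/*
 * Count the words at the low end of the region that still hold the pattern.
 * Assumes a descending stack, so stack_bottom[0] is the last word to be used.
 * A result of 0 means the stack has reached its limit at least once.
 */
size_t stack_unused_words(const uint32_t *stack_bottom, size_t words)
{
    assert(stack_bottom != NULL);
    size_t unused = 0;
    while (unused < words && stack_bottom[unused] == STACK_FILL_PATTERN) {
        unused++;
    }
    return unused;
}
```

A background self-test could call stack_unused_words periodically and enter a safe state when the remaining margin drops below a chosen threshold. This measures usage; it does not replace placing the stack so that an overflow cannot reach .bss or .data.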

3) is the hardest one to protect yourself from. Every variable shared between an ISR and the background program has to be handled with lots of care. Two issues exist: re-entrancy and compiler optimizer problems.

Re-entrancy has to be solved on a case-by-case basis with atomic access, semaphores, mutexes, or by temporarily disabling interrupts. This is always tricky: you have to ensure that you have considered every scenario and that the generated machine code actually does what you think it does.
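As an illustration of the "temporarily disabling interrupts" option, here is a minimal sketch of a background read that copies a multi-field record while the interrupt mask is set. It is a host-side simulation, not target code: irq_masked, irq_save_and_disable and irq_restore are hypothetical stand-ins. On a Cortex-M part they would typically be built on the CMSIS intrinsics __get_PRIMASK(), __disable_irq() and __set_PRIMASK(), but that mapping is an assumption to check against your toolchain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated global interrupt mask (hypothetical stand-in for PRIMASK). */
static bool irq_masked = false;

/* Mask interrupts and return the previous state so that calls can nest. */
static bool irq_save_and_disable(void)
{
    bool was_masked = irq_masked;
    irq_masked = true;
    return was_masked;
}

/* Restore the mask state that was in effect before the critical section. */
static void irq_restore(bool was_masked)
{
    irq_masked = was_masked;
}

typedef struct {
    uint32_t timestamp_ms;
    uint16_t speed;
} shared_sample_t;

/* Written by the ISR, read by the background program. */
static volatile shared_sample_t g_sample;

/* ISR side: updates both fields; it must never be seen half-finished. */
void sample_isr_write(uint32_t timestamp_ms, uint16_t speed)
{
    g_sample.timestamp_ms = timestamp_ms;
    g_sample.speed = speed;
}

/* Background side: take a consistent snapshot of both fields. */
shared_sample_t sample_read_atomic(void)
{
    shared_sample_t copy;
    bool was_masked = irq_save_and_disable();
    copy.timestamp_ms = g_sample.timestamp_ms;
    copy.speed = g_sample.speed;
    irq_restore(was_masked);
    return copy;
}
```

Restoring the previous mask state, rather than unconditionally enabling interrupts, keeps the function safe to call from code that has already masked interrupts. The masked window also adds to worst-case interrupt latency, so it has to be included in the timing analysis.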

The other issue is where your compiler does not realize that your ISR is called by the MCU rather than from your code, and therefore fails to understand that all variables used by the ISR can be updated at any point in time. The compiler may then optimize the background code incorrectly, since it assumes that a certain variable is never modified. This bug can be avoided by always declaring variables shared with an ISR as volatile.
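A minimal sketch of the volatile rule, using the timer interrupt from the question as the example. The names timer_isr, g_tick_count and ticks_elapsed_since are hypothetical. Note that volatile only addresses the optimizer problem, not re-entrancy: the sketch relies on the counter having a single writer (the ISR) and on an aligned 32-bit access being a single load or store, which is assumed here for a Cortex-M3 core.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Written only by the timer ISR, read only by the background program.
 * volatile forces the compiler to re-read it from memory on every access.
 */
static volatile uint32_t g_tick_count = 0u;

/* Timer ISR: the only writer of g_tick_count. */
void timer_isr(void)
{
    g_tick_count = g_tick_count + 1u;
}

/*
 * Background side: return how many ticks passed since the previous call.
 * The counter is read exactly once, so the result is based on one
 * consistent value; unsigned subtraction handles counter wrap-around.
 */
uint32_t ticks_elapsed_since(uint32_t *last_seen)
{
    assert(last_seen != 0);
    uint32_t now = g_tick_count;
    uint32_t elapsed = now - *last_seen;
    *last_seen = now;
    return elapsed;
}
```

Without volatile, a background loop that waits for g_tick_count to change could legally be compiled into a single read followed by an endless loop, because nothing in the visible code flow modifies the variable.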

Both of these issues are common sources of very subtle, but often severe, bugs. There's no standard way to protect yourself against them; the closest thing to a safety measure is to allow only your most hardened C veterans to write ISRs. Moderately experienced programmers, not to mention beginners, write these bugs over and over again.

Because of this, it is very hard to justify the use of interrupts in safety-critical applications. You would have to spend lots of time on the design, tests and documentation to verify that every such interrupt is not causing problems. Therefore I can understand why some safety standards ban the use of interrupts entirely.


As for the specific issue of CAN, it sounds a bit like you have either picked the wrong MCU for the task or you are not using the CAN controller correctly. More advanced CAN controllers have rx buffers which you can set up for dedicated messages, and in addition an rx FIFO where the rest of the messages go. I'm pretty sure NXP have such CAN controllers for their Cortex-M families, at least they do on LPC11C.

With such an approach and a carefully designed application-layer CAN protocol, you should not need rx interrupts. All safety-critical CAN networks must be designed to send messages over and over again periodically. If you know that a certain message arrives only once every 5 ms, then you merely have to ensure that your background program is fast enough to handle it before the next message arrives.
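The polling scheme described above can be sketched as a background routine that drains the receive FIFO on every cycle and supervises the age of the periodic safety message. This is a host-side simulation under stated assumptions, not LPC1778 driver code: the FIFO, sim_can_receive, can_rx_pop, can_poll, the message ID 0x100 and the 15 ms timeout (three periods of 5 ms) are all hypothetical. In particular, the simulated FIFO simply drops the newest frame and latches a flag when it is full; what the real controller does on overrun has to be taken from its user manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t id;
    uint8_t  dlc;
    uint8_t  data[8];
} can_frame_t;

/* Simulated hardware rx FIFO (depth chosen arbitrarily). */
#define RX_FIFO_DEPTH 4u
static can_frame_t rx_fifo[RX_FIFO_DEPTH];
static size_t rx_head = 0u;
static size_t rx_count = 0u;
static bool rx_overrun = false;   /* latched when a frame had to be dropped */

/* Simulation helper: the "bus" delivers a frame to the controller. */
bool sim_can_receive(const can_frame_t *frame)
{
    if (rx_count == RX_FIFO_DEPTH) {
        rx_overrun = true;        /* simulation choice: newest frame is lost */
        return false;
    }
    rx_fifo[(rx_head + rx_count) % RX_FIFO_DEPTH] = *frame;
    rx_count++;
    return true;
}

/* Hypothetical driver call: fetch the oldest pending frame, if any. */
bool can_rx_pop(can_frame_t *out)
{
    if (rx_count == 0u) {
        return false;
    }
    *out = rx_fifo[rx_head];
    rx_head = (rx_head + 1u) % RX_FIFO_DEPTH;
    rx_count--;
    return true;
}

#define SAFETY_MSG_ID         0x100u
#define SAFETY_MSG_TIMEOUT_MS 15u   /* assumed: 3 missed periods of 5 ms */

typedef struct {
    uint32_t last_rx_ms;
    bool     seen;
} msg_monitor_t;

/*
 * Call once per background cycle. Drains every pending frame, then reports
 * whether the safety message is fresh and no frame has ever been lost.
 */
bool can_poll(msg_monitor_t *mon, uint32_t now_ms)
{
    can_frame_t frame;
    while (can_rx_pop(&frame)) {
        if (frame.id == SAFETY_MSG_ID) {
            mon->last_rx_ms = now_ms;
            mon->seen = true;
        }
    }
    return mon->seen && !rx_overrun &&
           (now_ms - mon->last_rx_ms) <= SAFETY_MSG_TIMEOUT_MS;
}
```

The design point is that the worst-case interval between two calls to can_poll, multiplied by the worst-case frame rate on the bus, must stay below the FIFO depth. A false return value would typically send the system to its safe state.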

For SIL 4 you would likely have more than one CAN bus: you would dedicate one bus to safety-critical real-time messages and put everything else on another, non-critical bus. Redundancy solutions with multiple CAN buses transmitting the same critical data are also used sometimes.
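For the redundant variant, the receiver has to decide what to do with the two copies of each critical message. One simple option, shown here only as a sketch, is to accept the data when both buses delivered the same payload with the same sequence counter. The safety_msg_t layout, the rolling seq counter and the function name dual_bus_agree are assumptions for illustration; a real protocol would normally add a checksum and timeout handling as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t seq;       /* rolling counter set by the sender (assumed) */
    uint8_t len;       /* payload length, 0..8 bytes */
    uint8_t data[8];
} safety_msg_t;

/*
 * Return true only if the copies received on bus A and bus B carry the
 * same sequence counter, the same length and an identical payload.
 */
bool dual_bus_agree(const safety_msg_t *bus_a, const safety_msg_t *bus_b)
{
    assert(bus_a != NULL && bus_b != NULL);
    if (bus_a->len > 8u || bus_a->len != bus_b->len) {
        return false;
    }
    if (bus_a->seq != bus_b->seq) {
        return false;
    }
    return memcmp(bus_a->data, bus_b->data, bus_a->len) == 0;
}
```

A disagreement between the two copies is treated here as "do not use the data"; how long such a state may persist before the system enters its safe state is a decision for the safety concept.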