Electronic – Why don't on-board communications like I2C, SPI, etc. have error checking in general?

Tags: error-correction, i2c, spi

Error checking methods such as parity bits, checksums, and CRCs are used for wired and wireless communications. However, most ICs with interfaces like I2C or SPI don't use any error checking method.
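For a concrete sense of what such a check looks like when it is used: SMBus, a derivative of I2C, appends a CRC-8 byte called the Packet Error Code (PEC), computed with the polynomial x^8 + x^2 + x + 1 (0x07). Here is a minimal bitwise sketch of that CRC; the message bytes below are made-up placeholders, not a real SMBus transaction:

```python
def crc8_smbus(data: bytes) -> int:
    """Bitwise CRC-8 with polynomial x^8 + x^2 + x + 1 (0x07),
    the polynomial SMBus uses for its Packet Error Code (PEC)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

# A single flipped bit anywhere in the message changes the CRC,
# so the receiver can detect the corruption and reject the frame.
msg = bytes([0x40, 0x00, 0xFF])        # hypothetical address + register + data
corrupted = bytes([0x40, 0x00, 0xFD])  # one bit flipped in the last byte
assert crc8_smbus(msg) != crc8_smbus(corrupted)
```

Plain I2C and SPI transfers carry no such field, so a receiver like the PCF8574 has no way to tell a corrupted byte from a valid one.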

Search for "i2c i/o expander" and open a random datasheet. For example, consider the PCF8574 from TI, an 8-bit I/O expander. If a bit destined for the output register is flipped during the I2C transfer, the IC will drive the corresponding pin to an undesired level. Why don't most ICs of this kind have any error checking mechanism at all? My assumption is that even when communication is between ICs on the same board, all signals are noisy. Although the probability is quite low, noise can cause a bit flip.

Could this be the reason? No error checking mechanism guarantees completely error-free communication; it can only reduce the probability of error. Obviously, the probability of a bit error is higher for long-range communication than for on-board communication. Maybe the bit error probability for on-board communication is in an acceptable range even without any error checking mechanism.
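That intuition can be quantified. Assuming independent bit errors at some bit error rate (BER), the probability that an n-bit frame arrives corrupted is 1 − (1 − BER)^n. The BER figures below are made-up placeholders, purely to show how strongly the link quality dominates:

```python
def p_frame_error(ber: float, bits: int) -> float:
    """Probability that at least one of `bits` bits is flipped,
    assuming independent bit errors at the given bit error rate."""
    return 1.0 - (1.0 - ber) ** bits

# One I2C data byte plus its ACK bit is a 9-bit frame.
print(p_frame_error(1e-15, 9))  # clean on-board link: effectively never
print(p_frame_error(1e-6, 9))   # noisy long link: roughly 9 bad frames per million
```

If the on-board BER really is vanishingly small, the expected cost of an undetected error can be lower than the silicon and protocol cost of checking for it.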

What do you think?

Best Answer

You have to assume certain things just work, even in a world with error checking. Why pick on IIC or SPI when there are usually many more digital signals on a board? You seem to be OK with assuming those will all be interpreted as intended.

A properly designed circuit on a properly designed board should be reliable. Think of a CMOS output driving a CMOS input across a board. Other than outright component failure (which is a whole different problem from occasional data corruption), think about what can actually go wrong. At the driving end, you've got a FET with some maximum guaranteed on-resistance connecting the line to either Vdd or ground. What exactly do you imagine can cause that not to have the right level at the receiving end?

Initially the state can be undetermined while whatever capacitance is on the line charges or discharges. Then there can be ringing in the short trace. However, we can calculate the worst-case time for all of this to settle and for the line to be reliably across some threshold at the other end.
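As a sketch of that worst-case calculation, model the line as a single RC: a driver with on-resistance R charging the trace and pin capacitance C toward Vdd crosses the receiver threshold V_th after t = −RC·ln(1 − V_th/Vdd). All component values below are made-up examples, not figures for any particular part:

```python
import math

def settle_time(r_ohm: float, c_farad: float, vdd: float, v_th: float) -> float:
    """Worst-case time for an RC-limited line charging from 0 V toward vdd
    to cross the receiver threshold v_th: t = -R*C*ln(1 - v_th/vdd)."""
    return -r_ohm * c_farad * math.log(1.0 - v_th / vdd)

# Hypothetical numbers: 100-ohm driver into 20 pF of trace + pin capacitance,
# 3.3 V supply, input-high threshold at 70% of Vdd.
t = settle_time(100.0, 20e-12, 3.3, 0.7 * 3.3)
print(f"{t * 1e9:.2f} ns")  # a few nanoseconds
```

A few nanoseconds of settling against a clock period of hundreds of nanoseconds (100 kHz or 400 kHz I2C) is exactly why the bit can be treated as reliably settled by sampling time.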

Once this time has passed and we've waited out whatever the worst-case propagation delay of the logic is, there is little left to change the signal. You may be thinking that noise from other parts of the board can couple onto the signal. Yes, that can happen, but we can design for that too. The amount of noise in another part of the board is generally known. If not, then it's coming from elsewhere, and in a proper design it would be clamped to some maximum dV/dt and other characteristics. These things can all be designed for.

External noise can in theory upset traces on a board, but the field strength would need to be unreasonably large for a properly designed board. High noise environments do exist, but are limited to known locations. A board may not work 10 meters from a 10 kW transmitter, but even that can be designed for.

So the answer is basically that digital signals on the same board, if designed properly, can be considered absolutely reliable for most ordinary uses. In special cases where the cost of failure is very high, like space and some military applications, other strategies are used. These usually include redundant subsystems. You still consider individual signals on a board reliable, but assume boards or subsystems as a whole may occasionally err. Note also that these systems cost much more, and such a cost burden would make most ordinary systems, like personal computers for example, useless by being too expensive.

That all said, there are cases where error detection and correction is employed even in ordinary consumer electronics. This is usually because the process itself has a certain error probability and because limits are being pushed. High-speed main memory for computers often does include extra bits for error detection and/or correction. It's cheaper to get the performance and ultimate error rate by pushing limits and adding resources for error correction than by slowing things down and using more silicon to make everything inherently more reliable.
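As a minimal sketch of how that correction works, here is the classic Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword. Real DRAM ECC uses wider SECDED variants of the same idea, so treat this as an illustration of the principle rather than what an actual memory controller implements:

```python
def hamming74_encode(d: int) -> int:
    """Encode a 4-bit value as a 7-bit Hamming(7,4) codeword.
    Parity bits sit at positions 1, 2, 4; data at positions 3, 5, 6, 7."""
    d1, d2, d3, d4 = (d >> 3) & 1, (d >> 2) & 1, (d >> 1) & 1, d & 1
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    bits = [p1, p2, d1, p3, d2, d3, d4]   # positions 1..7
    return sum(b << (6 - i) for i, b in enumerate(bits))

def hamming74_correct(cw: int) -> int:
    """Fix at most one flipped bit in a codeword, return the 4 data bits."""
    bits = [(cw >> (6 - i)) & 1 for i in range(7)]   # bits[0] is position 1
    s1 = bits[0] ^ bits[2] ^ bits[4] ^ bits[6]       # recheck positions 1,3,5,7
    s2 = bits[1] ^ bits[2] ^ bits[5] ^ bits[6]       # recheck positions 2,3,6,7
    s3 = bits[3] ^ bits[4] ^ bits[5] ^ bits[6]       # recheck positions 4,5,6,7
    syndrome = s1 + 2 * s2 + 4 * s3                  # 0 = clean, else error position
    if syndrome:
        bits[syndrome - 1] ^= 1
    d1, d2, d3, d4 = bits[2], bits[4], bits[5], bits[6]
    return (d1 << 3) | (d2 << 2) | (d3 << 1) | d4
```

Flipping any single bit of a codeword yields a nonzero syndrome that directly names the position of the bad bit, which is what makes the correction cheap enough to do in hardware on every memory access.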