Electronic – Is it better to fix x's in the simulation or in the design?

digital-logic microcontroller programming simulation verilog

I have a question about how to deal with x's in Verilog netlist simulations. I have a disagreement with another engineer (who is a bit more senior than I am) about what the right approach is. Although this question might seem opinion based, I do think there is a right answer even if the views are divided.

I do development for mixed-signal ASICs. I'm doing netlist simulations of a microprocessor which doesn't have its program registers reset (it's an ARM Cortex-M3; there is an option to not have the registers reset during synthesis).

We have a ROM which the processor starts executing from after reset. During that program execution, one program register (R6) has x's in it because it wasn't reset. At a point in the simulation, x's from that register spread like wildfire through the rest of the design and break the simulation. We don't see this issue in RTL simulations.

I would prefer to make a design change to cause the registers to be reset, or to have the ROM program write zeros to those registers first thing. My colleague is very resistant to making any design change to clear out these x's, and he would prefer to mask them in the simulation somehow.

His contention is that the "x's aren't real because x's don't exist in real hardware". He therefore concludes that the x's in the simulation aren't real, and that any design change based on them is not a good thing, or at least, way too extreme of a response.

My contention is that although it is undoubtedly true that x's don't exist in hardware, they represent unknown or unpredictable values. I believe the gate library models x's to propagate pessimistically. Therefore, if x's are propagating to kill the simulation, it suggests there could be a combination of bits that would cause a problem. Since that is a possibility, I don't see making a design change to clear the x's as being too extreme, even if I can't prove they are absolutely real. (I suppose I could try a search for the bad combination of bits, but that would be a lot of work.)
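To make "pessimistic" concrete, here is a toy three-valued model. It is a Python sketch rather than Verilog, the helper names (`and3`, `or3`, `not3`, `mux_gates`) are invented for illustration, and it is not meant to reflect how any particular gate library is written. It builds a 2:1 mux out of AND/OR/NOT gates and compares what the gate model reports for an unknown select against what a brute-force search over the real values of that select gives.

```python
X = "x"  # unknown value, alongside the known values 0 and 1

def not3(a):
    return X if a == X else 1 - a

def and3(a, b):
    if a == 0 or b == 0:
        return 0          # a known 0 dominates, even against x
    if a == X or b == X:
        return X
    return 1

def or3(a, b):
    if a == 1 or b == 1:
        return 1          # a known 1 dominates, even against x
    if a == X or b == X:
        return X
    return 0

def mux_gates(sel, a, b):
    # Gate-level 2:1 mux: out = (sel & a) | (~sel & b)
    return or3(and3(sel, a), and3(not3(sel), b))

# Unknown select, but both data inputs are 1: the gate model still reports x.
pessimistic = mux_gates(X, 1, 1)

# Brute-force the two values the select could really take.
real_outcomes = {mux_gates(s, 1, 1) for s in (0, 1)}

print(pessimistic, real_outcomes)  # x {1}
```

In this toy case the x is an artifact of the model, since either real select value gives 1. With different data inputs (say 0 and 1) the same search returns two different outcomes, and then the x is reporting a real dependence on the unknown bit.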

Now, I can imagine an answer that the right approach depends on the level of quality being targeted here. But I think that the design changes I've suggested (adding in the resets or clearing the registers in the program) cost very little (especially modifying the ROM program). I think that going through the process of masking the bits in the simulation would be much more labor intensive.

What could I be misunderstanding from his point-of-view? What is the best approach? Is it really so bad to make a design change for a problem you can't prove is absolutely real?

Best Answer

I suppose it depends on where the x's are in the design.


Take an example communication scheme within the chip. You may want to pass data around between two components, but let's say not on every clock cycle. You might then decide to have a data bus and a valid signal. The valid signal says when the data is valid. Because of this, whenever the valid signal is low, the value of the data signal doesn't matter (because it is ignored).

In this example, as long as the valid signal is never a don't-care, the design will never do anything unexpected. The data signal can be don't-care while valid is low; it doesn't really matter, because you always know that there will be valid data (whatever it is) when the valid signal is high.

If the valid signal were ever don't-care, then you do have a problem. Why? Because it means you have some scenario where you have no idea what will happen. This may or may not cause an issue, but boy is it a nightmare to track down.
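The valid/data argument can be sketched with a small model. This is Python rather than Verilog, and `receiver_next` is a made-up name for a register that captures the data bus only when valid is high; treat it as an illustration of the reasoning, not as a model of any real interface.

```python
X = "x"  # unknown value

def receiver_next(state, valid, data):
    """Next value of a register that loads data only when valid is 1."""
    if valid == 1:
        return data
    if valid == 0:
        return state      # data is ignored, so an x on the bus is harmless
    # valid itself is unknown: the result is only known if both branches agree
    return state if data == state else X

held = receiver_next(5, 0, X)        # x on data while valid is low
corrupted = receiver_next(5, X, 7)   # x on valid
loaded = receiver_next(5, 1, 7)      # normal transfer

print(held, corrupted, loaded)  # 5 x 7
```

An x on the data bus never reaches the register while valid is low, but an x on valid turns a known register value into an unknown one.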


So, with that in mind, it is my opinion that:

If you come across x's in a data bus, it isn't the end of the world. In reality you don't know what the data will be at any given time anyway.

If they appear in control signals, it may or may not be bad; you don't know. In this situation you should run your simulation twice, once with the don't-care forced to 0 and a second time with it forced to 1. This way you know what will happen in both cases.

If you cannot be certain (i.e. from the code you have written) that a don't-care won't cause issues, then you should not ignore it: x is an unknown potential disaster. If you know that the value at any given point doesn't matter, then x is perfectly valid.
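The "run it twice" check can be sketched as follows. This is a Python toy, not a testbench: `design_masked` and `design_exposed` are hypothetical stand-ins for a full simulation run, and in a real flow the unknown bit would instead be pinned in the simulator (for example with a Verilog `force` in the testbench).

```python
def design_masked(ctrl):
    # ctrl is ANDed with a constant 0, so its value never reaches the output
    return ctrl & 0

def design_exposed(ctrl):
    # ctrl selects which of two different values comes out
    return 0xA if ctrl else 0x5

def x_matters(design):
    """Run the design with the unknown bit forced to 0 and to 1, then compare."""
    outcomes = {design(bit) for bit in (0, 1)}
    return len(outcomes) > 1

print(x_matters(design_masked), x_matters(design_exposed))  # False True
```

If both runs agree, the x was harmless at that point. If they differ, the design's behaviour really does depend on the uninitialised bit. With several unknown bits the number of runs grows as a power of two, which is why this only scales to a handful of control signals.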



Also, all control registers, whatever they are for, should be initialised to a value at reset. An uninitialised control register at power on could be disastrous. Imagine you make a control system for launching nuclear weapons and didn't initialise the 'launch' register to 0. For all you know when you turn the power on you could start WW3.