Electronic – Is it better to fix x's in the simulation or in the design?

digital-logic microcontroller programming simulation verilog

I have a question about how to deal with x's in Verilog netlist simulations. I have a disagreement with another engineer (who is a bit more senior than I am) about what the right approach is. Although this question might seem opinion based, I do think there is a right answer even if the views are divided.

I do development for mixed-signal ASICs. I'm doing netlist simulations of a microprocessor which doesn't have its program registers reset (it's an ARM Cortex-M3; there is an option to not have the registers reset during synthesis).

We have a ROM which the processor starts executing from after reset. During that program execution, one program register (R6) has x's in it because it wasn't reset. At a point in the simulation, x's from that register spread like wildfire through the rest of the design and break the simulation. We don't see this issue in RTL simulations.

I would prefer to make a design change to cause the registers to be reset, or to have the ROM program write zeros to those registers first thing. My colleague is very resistant to making any design change to clear out these x's, and he would prefer to mask them in the simulation somehow.

His contention is that the "x's aren't real because x's don't exist in real hardware". He therefore concludes that the x's in the simulation aren't real, and that any design change based on them is not a good thing, or at least way too extreme a response.

My contention is that although it is undoubtedly true that x's don't exist in hardware, they represent unknown or unpredictable values. I believe the gate library models x's to propagate pessimistically. Therefore, if x's are propagating to kill the simulation, it suggests there could be a combination of bits that would cause a problem. Since that is a possibility, I don't see making a design change to clear the x's as being too extreme, even if I can't prove they are absolutely real. (I suppose I could try a search for the bad combination of bits, but that would be a lot of work.)
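As a minimal illustration of the pessimism point, here is a Python sketch of three-valued gate evaluation. It is not Verilog and not any real gate library; the names `mux_gates` and `concrete_outcomes` are made up for the example. It shows a gate-level mux reporting x even when every real value of the unknown gives the same output, and a second case where the x reflects a genuine dependence on the unknown bit.

```python
from itertools import product

X = "x"  # stands for an unknown logic value


def v_not(a):
    return X if a == X else 1 - a


def v_and(a, b):
    if a == 0 or b == 0:
        return 0  # a controlling 0 hides any unknown on the other input
    if a == X or b == X:
        return X
    return 1


def v_or(a, b):
    if a == 1 or b == 1:
        return 1  # a controlling 1 hides any unknown on the other input
    if a == X or b == X:
        return X
    return 0


def mux_gates(sel, a, b):
    """Gate-level 2:1 mux: (sel AND a) OR (NOT sel AND b)."""
    return v_or(v_and(sel, a), v_and(v_not(sel), b))


def concrete_outcomes(fn, *args):
    """Evaluate fn for every 0/1 assignment of the x inputs; return the set of results."""
    unknown = [i for i, v in enumerate(args) if v == X]
    results = set()
    for bits in product((0, 1), repeat=len(unknown)):
        trial = list(args)
        for i, bit in zip(unknown, bits):
            trial[i] = bit
        results.add(fn(*trial))
    return results


# Pessimistic x: both data inputs are 1, so the unknown select cannot matter,
# yet gate-by-gate evaluation still reports x.
print(mux_gates(X, 1, 1), concrete_outcomes(mux_gates, X, 1, 1))

# Genuine x: the output really does depend on the unknown select.
print(mux_gates(X, 0, 1), concrete_outcomes(mux_gates, X, 0, 1))
```

An x in the netlist simulation is therefore a warning rather than a proof: it may be pure pessimism, or it may hide a bit combination that behaves differently, and only enumerating (or removing) the unknown tells the two apart.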

Now, I can imagine an answer that the right approach depends on the level of quality that is being targeted here. But I think that the design changes I've suggested (adding in the resets or clearing the registers in the program) cost very little, especially modifying the ROM program. I think that going through the process of masking the bits in the simulation would be much more labor-intensive.

What could I be misunderstanding from his point-of-view? What is the best approach? Is it really so bad to make a design change for a problem you can't prove is absolutely real?

Best Answer

I suppose it depends on where the x's are in the design.

Take an example communication scheme within the chip. You may want to pass data around between two components, but let's say not on every clock cycle. You might then decide to have a data bus and a valid signal. The valid signal says when the data is valid. Because of this, whenever the valid signal is low, the value of the data signal doesn't matter (because it is ignored).

In this example as long as the valid signal is never a don't-care, the design will never do anything unexpected. The data signal can be don't-care, it doesn't really matter, because you always know that there will be valid data (whatever it is) when the valid signal is high.

If the valid signal were ever don't-care, then you do have a problem. Why? Because it means you have some scenario where you have no idea what will happen. This may or may not cause an issue, but boy is it a nightmare to track down.

So, with that in mind, it is my opinion that:

If you come across x's in a data bus, it isn't the end of the world; in reality you don't know what the data will be at any given time anyway.

If they appear in control signals, it may or may not be bad; you don't know. In this situation you should run your simulation twice: once with the don't-care forced to 0, and a second time with it forced to 1. This way you know what will happen in both cases.

If you cannot be certain (i.e. from the code you have written) that a don't-care won't cause issues, then you should not ignore it: x is an unknown potential disaster. If you know that the value at any given point doesn't matter, then x is perfectly valid.
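The "run it twice" idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `receiver_step` is a hypothetical one-bit receiver that captures data when valid is 1, and `forced_outcomes` stands in for re-running the simulation with each unknown forced to 0 and then to 1.

```python
from itertools import product

X = "x"  # stands for an unknown logic value


def receiver_step(held, valid, data):
    """Hypothetical receiver: capture data only when valid is 1, else keep the held value."""
    return data if valid == 1 else held


def forced_outcomes(held, valid, data):
    """Re-run the step with every x input forced to 0 and to 1; collect the results."""
    inputs = [held, valid, data]
    unknown = [i for i, v in enumerate(inputs) if v == X]
    results = set()
    for bits in product((0, 1), repeat=len(unknown)):
        trial = list(inputs)
        for i, bit in zip(unknown, bits):
            trial[i] = bit
        results.add(receiver_step(*trial))
    return results


# x on the data bus while valid is low: every forced run agrees, so the x is harmless.
print(forced_outcomes(held=0, valid=0, data=X))

# x on the valid signal: the forced runs disagree, so this x must not be ignored.
print(forced_outcomes(held=0, valid=X, data=1))
```

If all forced runs agree, the x can be treated as a true don't-care at that point; if they disagree, the design's behaviour depends on a value nobody controls.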

Also, all control registers, whatever they are for, should be initialised to a value at reset. An uninitialised control register at power on could be disastrous. Imagine you make a control system for launching nuclear weapons and didn't initialise the 'launch' register to 0. For all you know when you turn the power on you could start WW3.

Related Solutions

Electronic – Cycle counting with modern CPUs (e.g. ARM)

I vote for DMA. It's really flexible in Cortex-M3 and up, and you can do all kinds of crazy things, like automatically getting data from one place and outputting it to another at a specified rate or on certain events, without spending ANY CPU cycles. DMA is much more reliable.

But it might be quite hard to understand in detail.

Another option is a soft core on an FPGA, with a hardware implementation of these timing-critical parts.

Electronic – the reason the PIC16 multitasking RTOS kernel doesn’t work

What you are trying to do is tricky, but very educational (if you are prepared to spend a lot of effort).

First, you must realise that this kind of PC-only (as opposed to PC+SP) task switching (which is the only thing you can do on a plain 12 or 14-bit PIC core) will only work when all the yield() statements in a task are in the same function: they can't be in a called function, and the compiler must not have messed with the function structure (as optimization might do).

currentTask->pch = PCLATH;\
currentTask->pcl = PCL + 8;\
asm("goto _taskswitcher");

You seem to assume that PCLATH is the upper bits of the program counter, as PCL is the lower bits. This is NOT the case. When you write to PCL the PCLATH bits are written to the PC, but the upper PC bits are never (implicitly) written to PCLATH. Re-read the relevant section of the datasheet.
Even if PCLATH were the upper bits of the PC, this would get you into trouble when the instruction after the goto is not on the same 256-instruction 'page' as the first instruction.
The plain goto will not work when _taskswitcher is not in the current PCLATH page; you will need an LGOTO or equivalent.

A solution to your PCLATH problem is to declare a label after the goto, and write the upper and lower bits of that label's address to your pch and pcl locations. But I am not sure you can declare a 'local' label in inline assembly. You sure can in plain MPASM (Olin will smile).
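To see why the saved PCLATH/PCL pair can resume at the wrong place, here is a toy Python model of the program counter. It assumes a 13-bit mid-range (PIC16-style) PC in which a write to PCL loads the upper bits from PCLATH and nothing ever copies the PC back into PCLATH. The class and function names and the addresses are made up for illustration, and instruction timing (what a real read of PCL returns mid-instruction) is ignored.

```python
class MidRangePC:
    """Toy model of a 13-bit program counter with a PCLATH write buffer."""

    def __init__(self, pc=0, pclath=0):
        self.pc = pc & 0x1FFF
        self.pclath = pclath & 0x1F  # not updated when the PC changes

    def read_pcl(self):
        return self.pc & 0xFF

    def write_pcl(self, value):
        # Upper PC bits come from PCLATH, lower 8 bits from the written value.
        self.pc = ((self.pclath & 0x1F) << 8) | (value & 0xFF)


def save_naive(cpu, offset=8):
    """What the macro does: store PCLATH as-is and PCL plus an offset (8-bit wrap)."""
    return cpu.pclath, (cpu.read_pcl() + offset) & 0xFF


def save_from_label(label_addr):
    """Label approach: store the high and low bytes of a known resume address."""
    return (label_addr >> 8) & 0x1F, label_addr & 0xFF


def resume(cpu, pch, pcl):
    cpu.pclath = pch
    cpu.write_pcl(pcl)
    return cpu.pc


# Code running at 0x0345 while PCLATH still holds 0: the naive save loses the upper bits.
cpu = MidRangePC(pc=0x0345, pclath=0x00)
print(hex(resume(cpu, *save_naive(cpu))))          # lands in page 0, not at 0x034d

# Even with a matching PCLATH, PCL + 8 wraps at a 256-instruction boundary.
cpu = MidRangePC(pc=0x03FC, pclath=0x03)
print(hex(resume(cpu, *save_naive(cpu))))          # wraps back into page 3

# Saving both bytes of the label address resumes where intended.
cpu = MidRangePC(pc=0x0345, pclath=0x00)
print(hex(resume(cpu, *save_from_label(0x034D))))
```

The model only covers the resume address; the goto to _taskswitcher has its own page dependence, which is not modelled here.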

Lastly, to do this kind of context switching you must save and restore ALL context that the compiler might depend on, which might include

indirection register(s)
status flags
scratch memory locations
local variables that might overlap in memory because the compiler does not realise that your tasks must be independent
other things I can't imagine right now but the compiler author might use in the next version of the compiler (they tend to be very imaginative)

The PIC architecture is more problematic in this respect because a lot of resources are located all over the memory map, where more traditional architectures have them in registers or on the stack. As a consequence, PIC compilers often do not generate reentrant code, which is what you definitely need to do the things you want (again, Olin will probably smile and assemble along).

If you are into this for the joy of writing a task switcher, I suggest that you switch to a CPU that has a more traditional organization, like an ARM or Cortex. If you are stuck with your feet in a concrete plate of PICs, study existing PIC switchers (for instance Salvo/Pumpkin?).
