Electronic – Why does this simple VHDL pattern for a shift register not work as expected

Tags: activehdl, modelsim, simulation, vhdl

At first glance you would expect the VHDL source code below to behave as a shift register. That is, over time temp would be

"UUUU0", "UUU00", "UU000", "U0000", "00000", ....

but instead it is still all 'U' after five (or more) consecutive clock cycles, so q is always 'U'.

Why is this?

This code is actually a much simplified version of a far more complicated simulation. But it demonstrates the symptoms that I see.

It exhibits this interesting and unexpected result during simulation under both ModelSim and ActiveHDL. I have not tried other simulators and would (in addition to an explanation of the cause) like to know whether others behave the same way.

To answer this question properly you must understand that:

  • I know this is not the best way of implementing a shift register
  • I know for RTL synthesis this should have a reset.
  • I know an array of std_logic is a std_logic_vector.
  • I know of the concatenation operator, &.

What I have also found:

  • If the assignment temp(0)<='0'; is moved inside the process, it works.
  • If the loop is unwrapped (see commented code), it works.

I will reiterate that this is a very simplified version of a much more complicated design (for a pipelined CPU), configured to purely show the unexpected simulation results. The actual signal types are just a simplification. For this reason you must consider your answers with the code in the form as it is.

My guess is that the VHDL simulation engine's optimiser is mistakenly (or perhaps as per specification) not bothering to run the expressions inside the loop as no signals outside change, though I can disprove this by placing the unwrapped loop in a loop.

So I expect that the answer to this question has more to do with what the standard says about simulating VHDL that is not fully explicit, and with how VHDL simulation engines do their optimisations, than with whether the given code example is the best way of doing something.

And now to the code I am simulating:

 library ieee;
 use ieee.std_logic_1164.all;   

 entity test_simple is
    port (
        clk : in  std_logic;
        q   : out std_logic
    );                   
 end entity;

 architecture example of test_simple is
    type   t_temp is array(4 downto 0) of std_logic;
    signal temp : t_temp;
 begin

    temp(0) <= '0';

    p : process (clk)
    begin               
        if rising_edge(clk) then
            for i in 1 to 4 loop
                temp(i) <= temp(i - 1);
            end loop;

            --temp(1) <= temp(0);   
            --temp(2) <= temp(1);
            --temp(3) <= temp(2);
            --temp(4) <= temp(3);
        end if;
    end process p;
    q <= temp(4);
 end architecture;

And the test bench:

library ieee;
use ieee.std_logic_1164.all;

entity Bench is
end entity;

architecture tb of bench is

component test_simple is
    port (
        clk : in  std_logic;
        q   : out std_logic
    );                   
end component;

signal clk:std_logic:='0';
signal q:std_logic;     
signal rst:std_logic;

constant freq:real:=100.0e3;

begin                       
    clk<=not clk after 0.5 sec / freq;

    TB:process
    begin
        rst<='1';
        wait for 10 us;
        rst<='0';
        wait for 100 us;
        wait;
    end process;

     --Note: rst is not connected
    UUT:test_simple  port map (clk=>clk,q=>q) ;
end architecture;

Best Answer

It has to do with what the tools can easily evaluate before simulation starts; formally, with what is called a "locally static expression". This is an obscure-looking rule, but it deserves some thought: it does eventually make sense, and your simulator is quite correct to alert you by generating non-obvious results.

Now, temp(1) can be evaluated at compile time (even earlier than elaboration time) and it can generate a driver on bit 1 of "temp".

However, temp(i) involves a bit more work for the tools. Given the trivial loop bounds here (1 to 4), it is obvious to us humans that the loop never assigns temp(0) and that what you are doing is safe. But imagine the bounds were functions, lower(foo) to upper(bar), declared in a package somewhere else. Now the most you can say with certainty is that temp is driven, so the "locally static" part of the name (what the LRM calls its longest static prefix) is temp as a whole.
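To make the distinction concrete, here is a minimal sketch. The entity name static_prefix_demo, the signals a and b, and the process labels are made up for illustration and are not part of the original design. Process p_static names its target with a literal index, so it should get a driver on that one element only. Process p_loop indexes its target with a loop variable, so it should get a driver on every element of b, including b(0), which it never assigns.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity static_prefix_demo is
end entity;

architecture sim of static_prefix_demo is
    type   t_temp is array (4 downto 0) of std_logic;
    signal a : t_temp;  -- target written with a literal index
    signal b : t_temp;  -- target written with a loop index
begin
    -- Concurrent drivers on element 0 of each signal.
    a(0) <= '0';
    b(0) <= '0';

    p_static : process
    begin
        a(1) <= a(0);          -- static name: driver on a(1) only
        wait for 10 ns;
    end process p_static;

    p_loop : process
    begin
        for i in 1 to 1 loop
            b(i) <= b(i - 1);  -- i is not static: drivers on all of b
        end loop;
        wait for 10 ns;
    end process p_loop;

    check : process
    begin
        wait for 50 ns;
        -- a(0) has a single driver, so the '0' gets through.
        assert a(0) = '0' and a(1) = '0'
            report "a did not shift as expected" severity error;
        -- b(0) has two drivers ('U' from p_loop, '0' from the
        -- concurrent assignment), so it stays at 'U'.
        assert b(0) = 'U' and b(1) = 'U'
            report "b was expected to stay at 'U'" severity error;
        wait;
    end process check;
end architecture;
```

The loop here runs only once, from 1 to 1, to underline that the number of iterations does not matter: what counts is that the index is a loop parameter rather than a literal.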

And that means that the process is constrained by these rules to drive all of temp, at which point you have multiple drivers on temp(0): the process's driver (never assigned and with no initial value, so it stays at 'U') and the external temp(0) <= '0';. So naturally the two drivers resolve to 'U'.
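The resolution step can be checked in isolation with the resolved function that ieee.std_logic_1164 declares for std_logic. The sketch below is only an illustration (the entity name resolve_demo and the constant drivers are made up); it feeds the function the two driver values described above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity resolve_demo is
end entity;

architecture sim of resolve_demo is
begin
    check : process
        -- One entry per driver on temp(0): the process's never-assigned
        -- driver ('U') and the concurrent assignment ('0').
        constant drivers : std_ulogic_vector(0 to 1) := ('U', '0');
    begin
        report "resolved value is " & std_ulogic'image(resolved(drivers));
        -- 'U' dominates every other value in the std_logic resolution table.
        assert resolved(drivers) = 'U'
            report "expected the two drivers to resolve to 'U'"
            severity error;
        wait;
    end process check;
end architecture;
```

Because 'U' wins against every other value, the '0' on temp(0) can never become visible while the process also holds a driver on that element.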

The alternative would be a "hacky little rule" (opinion): if the loop bounds were constants, do one thing, but if they were declared as something else, do something else, and so on. The more such oddball little rules there are, the more complex the language becomes, which in my opinion is not a better solution.
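Within the rule as it stands, one way to keep an indexed form is a for ... generate statement instead of a loop inside the process. This is a sketch under the assumption that the generate parameter counts as static, so that each generated process names a single element and owns a driver on that element only. The entity name test_simple_gen and the label g_stage are made up for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity test_simple_gen is
    port (
        clk : in  std_logic;
        q   : out std_logic
    );
end entity;

architecture example of test_simple_gen is
    type   t_temp is array (4 downto 0) of std_logic;
    signal temp : t_temp;
begin
    -- Only driver on temp(0): no generated process names this element.
    temp(0) <= '0';

    -- One process per stage; i is fixed when the design is elaborated,
    -- so temp(i) refers to exactly one element in each copy.
    g_stage : for i in 1 to 4 generate
        p : process (clk)
        begin
            if rising_edge(clk) then
                temp(i) <= temp(i - 1);
            end if;
        end process p;
    end generate g_stage;

    q <= temp(4);
end architecture;
```

The other route, already noted in the question, is to move temp(0) <= '0'; inside the original process, so that the process driving all of temp is also the one that assigns element 0.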
