Electronic – About the code for an FIR filter

fpga, vhdl

Below is a 4-tap FIR filter, which means it has 4 coefficients (a filter order of 3). The input is an 8-bit signed value and the output is a 16-bit signed value. The design consists of two files: the main file, which defines all the multipliers and adders, and a second file that defines the D flip-flop.

The main file is given below:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity fir_4tap is
port(   Clk : in std_logic; --clock signal
        Xin : in signed(7 downto 0); --input signal
        Yout : out signed(15 downto 0)  --filter output
        );
end fir_4tap;

architecture Behavioral of fir_4tap is

component DFF is 
   port(
      Q : out signed(15 downto 0);      --output connected to the adder
      Clk :in std_logic;      -- Clock input
      D :in  signed(15 downto 0)      -- Data input from the MCM block.
   );
end component;  

signal H0,H1,H2,H3 : signed(7 downto 0) := (others => '0');
signal MCM0,MCM1,MCM2,MCM3,add_out1,add_out2,add_out3 : signed(15 downto 0) := (others => '0');
signal Q1,Q2,Q3 : signed(15 downto 0) := (others => '0');

begin

--filter coefficient initializations.
--H = [-2 -1 3 4].
H0 <= to_signed(-2,8);
H1 <= to_signed(-1,8);
H2 <= to_signed(3,8);
H3 <= to_signed(4,8);

--Multiple constant multiplications.
MCM3 <= H3*Xin;
MCM2 <= H2*Xin;
MCM1 <= H1*Xin;
MCM0 <= H0*Xin;

--adders
add_out1 <= Q1 + MCM2;
add_out2 <= Q2 + MCM1;
add_out3 <= Q3 + MCM0;

--flipflops(for introducing a delay).
dff1 : DFF port map(Q1,Clk,MCM3);
dff2 : DFF port map(Q2,Clk,add_out1);
dff3 : DFF port map(Q3,Clk,add_out2);

--an output produced at every positive edge of clock cycle.
process(Clk)
begin
    if(rising_edge(Clk)) then
        Yout <= add_out3;
    end if;
end process;

end Behavioral;

VHDL code for the component DFF is given below:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity DFF is 
   port(
      Q : out signed(15 downto 0);      --output connected to the adder
      Clk :in std_logic;      -- Clock input
      D :in  signed(15 downto 0)      -- Data input from the MCM block.
   );
end DFF;

architecture Behavioral of DFF is 

signal qt : signed(15 downto 0) := (others => '0');

begin 

Q <= qt;

process(Clk) 
begin 
  if ( rising_edge(Clk) ) then 
    qt <= D;
  end if;       
end process; 

end Behavioral;

I have written a small testbench for the design. It applies 8 test inputs serially to the filter module:

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

ENTITY tb IS
END tb;

ARCHITECTURE behavior OF tb IS 

   signal Clk : std_logic := '0';
   signal Xin : signed(7 downto 0) := (others => '0');
   signal Yout : signed(15 downto 0) := (others => '0');
   constant Clk_period : time := 10 ns;

BEGIN

    -- Instantiate the Unit Under Test (UUT)
   uut: entity work.fir_4tap PORT MAP (
          Clk => Clk,
          Xin => Xin,
          Yout => Yout
        );

   -- Clock process definitions
   Clk_process :process
   begin
        Clk <= '0';
        wait for Clk_period/2;
        Clk <= '1';
        wait for Clk_period/2;
   end process;

   -- Stimulus process
   stim_proc: process
   begin        
      wait for Clk_period*2;
        Xin <= to_signed(-3,8); wait for clk_period*1;
        Xin <= to_signed(1,8); wait for clk_period*1;
        Xin <= to_signed(0,8); wait for clk_period*1;
        Xin <= to_signed(-2,8); wait for clk_period*1;
        Xin <= to_signed(-1,8); wait for clk_period*1;
        Xin <= to_signed(4,8); wait for clk_period*1;
        Xin <= to_signed(-5,8); wait for clk_period*1;
        Xin <= to_signed(6,8); wait for clk_period*1;
        Xin <= to_signed(0,8);

      wait;
   end process;

END;

My main question: the simulation results look correct, but will this code work on a Virtex-4 FPGA? I am using the Xilinx software, which has IP core generators, but I am not using them because I want to practice writing good code myself.

Best Answer

Add to your testbench a monitor process that watches the output and compares it to the expected value, allowing for the delays through the filter.
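A minimal sketch of such a monitor, meant to be pasted into the architecture of tb after the stimulus process. It is an assumption about how one might do it, not the only way: the process name monitor, the local type int_array and the variables x_hist and expected are invented for illustration. It keeps its own copy of the last four inputs, recomputes the filter sum with integers, and compares that with Yout one clock edge later, which matches the single output register in fir_4tap.

```vhdl
   -- Monitor process (illustrative): checks Yout against an integer
   -- reference model of the same 4-tap filter.
   monitor : process(Clk)
      type int_array is array (natural range <>) of integer;
      -- Same coefficients as H0..H3 in fir_4tap.
      constant H        : int_array(0 to 3) := (-2, -1, 3, 4);
      -- x_hist(0) is the newest sampled input, x_hist(3) the oldest.
      variable x_hist   : int_array(0 to 3) := (others => 0);
      variable expected : integer := 0;
   begin
      if rising_edge(Clk) then
         -- Yout still holds the value registered on the previous edge,
         -- so compare it with the sum computed on that edge.
         -- Skip the check while Yout is uninitialised ('U' or 'X').
         if not is_x(std_logic_vector(Yout)) then
            assert to_integer(Yout) = expected
               report "Yout = " & integer'image(to_integer(Yout)) &
                      ", expected " & integer'image(expected)
               severity error;
         end if;

         -- Shift in the input the filter samples on this edge.
         x_hist(1 to 3) := x_hist(0 to 2);
         x_hist(0)      := to_integer(Xin);

         -- Reference result that Yout should show after this edge.
         expected := 0;
         for i in H'range loop
            expected := expected + H(i) * x_hist(i);
         end loop;
      end if;
   end process;
```

With the eight inputs from the testbench above, Yout should step through 6, 1, -10, -5, 8, -13, -5, 1, -5, -2, 24 and then return to 0. Any deviation is reported as an error in the simulator log.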

Your VHDL style is better than some, even better than some textbook examples I have seen, but still, a few comments:

Using named rather than positional association for the "dff" component instantiations would avoid possible confusion between inputs and outputs for anyone trying to follow the pipeline. This doesn't really matter here, because:
You can eliminate the DFFs altogether. Replace them with lines of the form Q1 <= MCM3; in the same clocked process as Yout <= add_out3; which greatly simplifies the whole filter.
You can reduce the number of conversion functions by making H0..H3 integers, and if they are constant, make them constants! Since multiplication between signed and integer is defined in numeric_std, no other changes are required. Declare the range as a subtype of integer so that those operators still apply:

subtype coefficient is integer range -128 to 127;
constant H0 : coefficient := -2;
Lose the redundant parentheses in if ( rising_edge(Clk) ) then; this isn't C!
The DRY principle applies in VHDL too. There are several ways to apply it to the testbench; my choice would be a local procedure.

   stim_proc: process

      procedure Input(D : in integer range -128 to 127) is
      begin
         Xin <= to_signed(D,8); 
         wait for clk_period*1;
      end Input;

   begin        
      wait for Clk_period*2;
        Input(-3);
        Input( 1);
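Putting the suggestions about the DFFs and the coefficients together, the whole filter might collapse to something like the sketch below. This is one possible reading of the advice, not a tested replacement: the architecture name Simplified is invented here, and the MCM and add_out signals are folded directly into the clocked assignments. It would replace both original files.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity fir_4tap is
   port(
      Clk  : in  std_logic;             -- clock signal
      Xin  : in  signed(7 downto 0);    -- input signal
      Yout : out signed(15 downto 0)    -- filter output
   );
end fir_4tap;

architecture Simplified of fir_4tap is

   -- Coefficients as integer constants, H = [-2 -1 3 4].
   subtype coefficient is integer range -128 to 127;
   constant H0 : coefficient := -2;
   constant H1 : coefficient := -1;
   constant H2 : coefficient := 3;
   constant H3 : coefficient := 4;

   -- Pipeline registers that replace the three DFF instances.
   signal Q1, Q2, Q3 : signed(15 downto 0) := (others => '0');

begin

   process(Clk)
   begin
      if rising_edge(Clk) then
         -- integer * signed(7 downto 0) gives a 16-bit signed product,
         -- so each sum stays 16 bits wide as in the original design.
         Q1   <= H3 * Xin;
         Q2   <= Q1 + H2 * Xin;
         Q3   <= Q2 + H1 * Xin;
         Yout <= Q3 + H0 * Xin;
      end if;
   end process;

end Simplified;
```

Each signal assignment in the clocked process reads the register values from before the edge, so the behaviour should match the original structure of three flip-flops, three adders and a registered output.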

Related Solutions

Electronic – Code example for FIR/IIR filters in VHDL

It sounds like you need to figure out the DSP aspects first, then make an implementation in FPGA.

Sort out the DSP in C, Matlab, Excel, or anywhere else
Try and think how you'll transfer what you've learned from that into FPGA-land
Discover you've made some assumption about the implementation that doesn't work well (like the use of floating point for example)
Go back and update your offline DSP stuff to take account of this.
Iterate n times :)

Regarding data types, you can use integers just fine.

Here's some sample code to get you going. Note that it ignores a lot of real-world issues (for example reset and overflow management), and that it relies on integer_vector, which requires VHDL-2008, but hopefully it's instructive:

library ieee;
use ieee.std_logic_1164.all;
entity simple_fir is
    generic (taps : integer_vector); 
    port (
        clk      : in  std_logic;
        sample   : in  integer;
        filtered : out integer := 0);
end entity simple_fir;
----------------------------------------------------------------------------------------------------------------------------------
architecture a1 of simple_fir is
begin  -- architecture a1
    process (clk) is
        variable delay_line : integer_vector(0 to taps'length-1) := (others => 0);
        variable sum : integer;
    begin  -- process
        if rising_edge(clk) then  -- rising clock edge
            delay_line := sample & delay_line(0 to taps'length-2);
            sum := 0;
            for i in 0 to taps'length-1 loop
                sum := sum + delay_line(i)*taps(taps'high-i);
            end loop;
            filtered <= sum;
        end if;
    end process;
end architecture a1;
----------------------------------------------------------------------------------------------------------------------------------
-- testbench
----------------------------------------------------------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;
entity tb_simple_fir is
end entity tb_simple_fir;
architecture test of tb_simple_fir is
    -- component generics
    constant lp_taps : integer_vector := ( 1, 1, 1, 1, 1);
    constant hp_taps : integer_vector := (-1, 0, 1);

    constant samples : integer_vector := (0,0,0,0,1,1,1,1,1);

    signal sample   : integer;
    signal filtered : integer;
    signal Clk : std_logic := '1';
    signal finished : std_logic;
begin  -- architecture test
    DUT: entity work.simple_fir
        generic map (taps => lp_taps)  -- try other taps in here
        port map (
            clk      => clk,
            sample   => sample,
            filtered => filtered);

    -- waveform generation
    WaveGen_Proc: process
    begin
        finished <= '0';
        for i in samples'range loop
            sample <= samples(i);
            wait until rising_edge(clk);
        end loop;
        -- allow pipeline to empty - input will stay constant
        for i in 0 to 5 loop
            wait until rising_edge(clk);
        end loop;
        finished <= '1';
        report (time'image(now) & " Finished");
        wait;
    end process WaveGen_Proc;

    -- clock generation
    Clk <= not Clk after 10 ns when finished /= '1' else '0';
end architecture test;

Electronic – Low Cost FPGA for 500MHz FIR Filter

Until recently, 500 MHz would have been considered a fairly fast clock, requiring a relatively high-end (and high-cost) FPGA. But nowadays a low-cost part ought to be able to do that.

However, other specs are just as important as the data rate in determining which part will work for you:

What's the data width? A 16-bit adder requires a longer carry chain than an 8-bit adder and so requires a longer clock period in a given architecture and speed grade.
How many taps in the filter? A very large number means working with RAMs instead of just registers, leading to a new set of timing requirements and new considerations for which parts will meet your needs.
What are the weights? Equal weights on all taps means a much simpler calculation. If you have different weights on each tap, you might need to redo the complete set of add-multiplies for each new input sample, making for a much harder problem.
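To illustrate why equal weights are simpler: if all N weights are 1, the output is just a running sum, which can be updated by adding the newest sample and subtracting the one that falls out of the window, with no multipliers at all. The sketch below assumes 8-bit samples and a 16-bit sum; the entity name moving_sum, the generic N and the port names are hypothetical, and overflow handling and reset are left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity moving_sum is
    generic (N : positive := 4);            -- window length (number of taps)
    port (
        clk    : in  std_logic;
        sample : in  signed(7 downto 0);
        sum    : out signed(15 downto 0));  -- sum of the last N samples
end entity moving_sum;

architecture rtl of moving_sum is
    type delay_t is array (0 to N-1) of signed(7 downto 0);
    signal delay_line : delay_t := (others => (others => '0'));
    signal acc        : signed(15 downto 0) := (others => '0');
begin
    process (clk) is
    begin
        if rising_edge(clk) then
            -- delay_line(N-1) holds the sample from N clocks ago:
            -- add the new sample, drop the oldest one.
            acc <= acc + resize(sample, 16) - resize(delay_line(N-1), 16);
            -- Shift the delay line by one position.
            delay_line(1 to N-1) <= delay_line(0 to N-2);
            delay_line(0)        <= sample;
        end if;
    end process;

    sum <= acc;
end architecture rtl;
```

The work per sample is one add and one subtract regardless of N, whereas distinct weights need a multiply-add per tap.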

But if your other specs aside from clock rate are fairly relaxed you might be able to do this in a low cost device.

All the FPGA vendors have low-cost FPGA lines that can be priced as low as $5 each. For example, Xilinx has Spartan and Artix, Altera has Cyclone, etc. In recent generations, these parts should be able to do at least minimal logic at 500 MHz. But if you have to do wide add-multiplies or something, you may have to do some very careful pipelining or other tricks to get them to work. Be sure to look at the most recent generation of parts to get best performance, best pricing (unless a family is absolutely brand-new), and longest assurance of supply.
Recent CPLDs from Altera and Lattice are really small FPGAs with built-in flash to allow automatic reconfiguration on power-up. For a simple filter, these might be sufficient.

But without knowing your complete design we can't tell you what device will work. You'll have to just try designing it for each candidate part and use the vendor's synthesis tools to find out if you can meet timing in each case.
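As a rough idea of what the pipelining mentioned above can look like, here is a sketch of one multiply-add tap with a register between the multiplier and the adder, so that neither operation has to share a clock period with the other. The entity name pipelined_tap, the port names and the 16-bit and 32-bit widths are assumptions made for illustration. Whether this meets timing at 500 MHz on a given part can only be answered by the vendor's tools.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity pipelined_tap is
    port (
        clk     : in  std_logic;
        sample  : in  signed(15 downto 0);   -- input sample, shared by all taps
        coeff   : in  signed(15 downto 0);   -- weight for this tap
        sum_in  : in  signed(31 downto 0);   -- partial sum from the previous tap
        sum_out : out signed(31 downto 0));  -- partial sum to the next tap
end entity pipelined_tap;

architecture rtl of pipelined_tap is
    signal product : signed(31 downto 0) := (others => '0');
begin
    process (clk) is
    begin
        if rising_edge(clk) then
            -- Stage 1: register the 16x16 product on its own.
            product <= sample * coeff;
            -- Stage 2: add the registered product to the incoming sum.
            -- No overflow handling; the sum is assumed to fit in 32 bits.
            sum_out <= sum_in + product;
        end if;
    end process;
end architecture rtl;
```

In a transposed-form filter like the 4-tap example, where every tap sees the same input sample, adding this product register to every tap delays all products equally, so the result is the same filter with one extra clock of latency.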
