I am trying to implement an FIR filter in Verilog. I have predetermined the coefficients in MATLAB. But I am not sure whether the registers will propagate properly with this code.
module fir_filter(
input clock,
input reset,
input wire[15:0] input_sample,
output reg[15:0] output_sample);
parameter N = 13;
reg signed[15:0] coeffs[12:0];
reg [15:0] holderBefore[12:0];
wire [15:0] toAdd[12:0];
always @(*)
begin
coeffs[0]=6375;
coeffs[1]=1;
coeffs[2]=-3656;
coeffs[3]=3;
coeffs[4]=4171;
coeffs[5]=4;
coeffs[6]=28404;
coeffs[7]=4;
coeffs[8]=4171;
coeffs[9]=3;
coeffs[10]=-3656;
coeffs[11]=1;
coeffs[12]=6375;
end
genvar i;
generate
for (i=0; i<N; i=i+1)
begin: mult
multiplier mult1(
.dataa(coeffs[i]),
.datab(holderBefore[i]),
.result(toAdd[i]));
end
endgenerate
always @(posedge clock or posedge reset)
begin
if(reset)
begin
holderBefore[12] <= 0;
holderBefore[11] <= 0;
holderBefore[10] <= 0;
holderBefore[9] <= 0;
holderBefore[8] <= 0;
holderBefore[7] <= 0;
holderBefore[6] <= 0;
holderBefore[5] <= 0;
holderBefore[4] <= 0;
holderBefore[3] <= 0;
holderBefore[2] <= 0;
holderBefore[1] <= 0;
holderBefore[0] <= 0;
output_sample <= 0;
end
else
begin
holderBefore[12] <= holderBefore[11];
holderBefore[11] <= holderBefore[10];
holderBefore[10] <= holderBefore[9];
holderBefore[9] <= holderBefore[8];
holderBefore[8] <= holderBefore[7];
holderBefore[7] <= holderBefore[6];
holderBefore[6] <= holderBefore[5];
holderBefore[5] <= holderBefore[4];
holderBefore[4] <= holderBefore[3];
holderBefore[3] <= holderBefore[2];
holderBefore[2] <= holderBefore[1];
holderBefore[1] <= holderBefore[0];
holderBefore[0] <= input_sample;
output_sample <= (input_sample + toAdd[0] + toAdd[1] +
toAdd[2] + toAdd[3] + toAdd[4] + toAdd[5] +
toAdd[6] + toAdd[7] + toAdd[8] + toAdd[9] +
toAdd[10] + toAdd[11] + toAdd[12]);
end
end
endmodule
Is this the best way to implement this? Is there a better way to do the addition?
Any help is greatly appreciated!
Resources that would help are also greatly appreciated.
Best Answer
Area and power efficient FIR/IIR filters are the holy grail for some.
Using generate statements you have instantiated 13 multipliers. Multipliers take up quite a lot of area. It is common to instantiate only one and time-multiplex it (time-division multiplexing, TDM). In that case, supply a clock (tick) 13 times faster than the required output rate.
Your adder chain, while it looks valid, is also going to be very big and could lead to timing problems, as there could be very long ripple-carry chains. Breaking the addition down over multiple cycles might result in lower area and power.
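One way to break the addition over multiple cycles is a registered adder tree. The following is only a sketch under assumptions: the module name adder_tree_13 and the flattened products_flat port are hypothetical, the 13 products are taken to be 32-bit signed values, and the grouping into two stages is one arbitrary choice among many.

```verilog
// Sketch: sum 13 signed 32-bit products in two registered stages
// instead of one long combinational chain. Latency is 2 clock cycles.
module adder_tree_13 (
    input  wire               clock,
    input  wire [13*32-1:0]   products_flat, // hypothetical: 13 products packed into one bus
    output reg  signed [35:0] sum            // 32 bits + 4 guard bits for 13 terms
);
    // Unpack the flat bus into 13 signed words.
    wire signed [31:0] p [0:12];
    genvar i;
    generate
        for (i = 0; i < 13; i = i + 1) begin : unpack
            assign p[i] = products_flat[i*32 +: 32];
        end
    endgenerate

    // Stage 1: partial sums of up to four products each (34 bits suffice).
    reg signed [33:0] s0, s1, s2;
    reg signed [31:0] s3;

    always @(posedge clock) begin
        s0  <= p[0] + p[1] + p[2]  + p[3];
        s1  <= p[4] + p[5] + p[6]  + p[7];
        s2  <= p[8] + p[9] + p[10] + p[11];
        s3  <= p[12];
        // Stage 2: combine the partial sums registered on the previous edge.
        sum <= s0 + s1 + s2 + s3;
    end
endmodule
```

Each stage now contains at most three additions in series, at the cost of two cycles of latency and the extra pipeline registers.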
If you combine the multiplication of a sample with the addition you will have a more typical MAC architecture (Multiply Accumulate).
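A minimal sketch of such a time-multiplexed MAC is shown here, using the coefficients from the question. It is not a drop-in replacement: the module name fir_mac, the sample_valid strobe and the output_valid flag are assumptions made for illustration, and the result is left at full 36-bit precision rather than scaled back to 16 bits. This version spends one cycle loading the sample and 13 cycles accumulating, so it needs at least 14 clock cycles per input sample.

```verilog
// Sketch: one shared multiplier, one accumulator, 13 taps.
module fir_mac (
    input  wire               clock,
    input  wire               reset,
    input  wire               sample_valid,  // hypothetical: one-cycle pulse per new sample
    input  wire signed [15:0] input_sample,
    output reg  signed [35:0] output_sample, // 32-bit product + 4 guard bits
    output reg                output_valid   // hypothetical: one-cycle pulse per result
);
    localparam N = 13;

    reg signed [15:0] delay_line [0:N-1];
    reg signed [35:0] acc;
    reg        [3:0]  tap;
    reg               busy;
    integer           k;

    // Coefficient look-up table (values taken from the question).
    function signed [15:0] coeff (input [3:0] idx);
        case (idx)
            4'd0, 4'd12: coeff = 16'sd6375;
            4'd1, 4'd11: coeff = 16'sd1;
            4'd2, 4'd10: coeff = -16'sd3656;
            4'd3, 4'd9:  coeff = 16'sd3;
            4'd4, 4'd8:  coeff = 16'sd4171;
            4'd5, 4'd7:  coeff = 16'sd4;
            4'd6:        coeff = 16'sd28404;
            default:     coeff = 16'sd0;
        endcase
    endfunction

    // The single multiplier, shared by all taps.
    wire signed [31:0] product = coeff(tap) * delay_line[tap];

    always @(posedge clock or posedge reset) begin
        if (reset) begin
            for (k = 0; k < N; k = k + 1)
                delay_line[k] <= 16'sd0;
            acc           <= 36'sd0;
            tap           <= 4'd0;
            busy          <= 1'b0;
            output_sample <= 36'sd0;
            output_valid  <= 1'b0;
        end else begin
            output_valid <= 1'b0;
            if (sample_valid && !busy) begin
                // Shift the new sample in and start a fresh accumulation.
                for (k = N-1; k > 0; k = k - 1)
                    delay_line[k] <= delay_line[k-1];
                delay_line[0] <= input_sample;
                acc           <= 36'sd0;
                tap           <= 4'd0;
                busy          <= 1'b1;
            end else if (busy) begin
                if (tap == N-1) begin
                    // Last tap: publish the finished sum.
                    output_sample <= acc + product;
                    output_valid  <= 1'b1;
                    busy          <= 1'b0;
                end else begin
                    acc <= acc + product;
                    tap <= tap + 4'd1;
                end
            end
        end
    end
endmodule
```

The trade-off is throughput for area: one multiplier and one adder replace 13 multipliers and the adder chain, but the clock must run many times faster than the sample rate.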
I would also avoid initialising constants in an always @* block: since nothing on the right-hand side of the assignments ever changes, the block may never be triggered in simulation. For these I would use localparams or, if going down the TDM route, I would create a look-up table (LUT).
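As a sketch of the localparam approach for the fully parallel design, the coefficients can be packed into one constant vector and sliced with an indexed part-select. The module name fir_products and the flattened samples_flat and products_flat ports are assumptions made to keep the example self-contained, and the built-in * operator stands in for the multiplier module from the question.

```verilog
// Sketch: constant coefficients held in a localparam instead of an always @* block.
module fir_products (
    input  wire [13*16-1:0] samples_flat,  // hypothetical: delay line packed into one bus
    output wire [13*32-1:0] products_flat  // hypothetical: 13 products packed into one bus
);
    localparam N = 13;

    // The leftmost entry is tap 12 and the rightmost is tap 0;
    // the filter is symmetric, so the order reads the same either way.
    localparam [N*16-1:0] COEFFS = {
        16'sd6375, 16'sd1, -16'sd3656, 16'sd3, 16'sd4171, 16'sd4,
        16'sd28404,
        16'sd4, 16'sd4171, 16'sd3, -16'sd3656, 16'sd1, 16'sd6375
    };

    genvar i;
    generate
        for (i = 0; i < N; i = i + 1) begin : mult
            // Part-selects are unsigned, so re-declare both operands as signed.
            wire signed [15:0] c = COEFFS[i*16 +: 16];
            wire signed [15:0] x = samples_flat[i*16 +: 16];
            assign products_flat[i*32 +: 32] = c * x;
        end
    endgenerate
endmodule
```

Because COEFFS is a compile-time constant, there is no sensitivity list to worry about, and the synthesis tool is free to optimise multipliers whose coefficient is small, such as 1, 3 or 4.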