Electronic – FPGA maximum frequency : limiting factor

clock-speed, fir, fpga, intel-fpga, quartus

I would like to know what, in general, limits the maximum clock frequency of a circuit implemented in an FPGA.
Specifically, I am building some FIR filters in Quartus and simulating them on an FPGA of the Cyclone II family.

My simulation shows that a second-order FIR using direct-form adders can be clocked at a higher frequency than a second-order FIR using transposed-form adders (420 MHz vs. 387 MHz).
I did not expect this, given that the critical path of the direct form is longer (2 sums + 1 multiplication) than that of the transposed form (1 sum + 1 multiplication).

Is this because the direct form has a more parallel architecture than the transposed form, so the FPGA 'likes' it better?
[Figure 1: direct form]
[Figure 2: transposed form]

Best Answer

I suspect the difference is due to the negative coefficient in the 2nd case (according to the order of your diagrams).

Because your multiplying coefficients are all powers of 2, your multiplies can all be done by simple bit selects. For example, assuming you're doing 16-bit math, x*0.25 can be calculated as simply {2'b0, x[15:2]} (using Verilog notation).
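
To make the bit-select idea concrete, here is a minimal Python model (not HDL) of the 16-bit case. The function name and the unsigned interpretation are assumptions for illustration: the zero fill in {2'b0, x[15:2]} only matches x*0.25 for unsigned values, and a signed input would need the sign bit replicated into the top two positions instead.

```python
WIDTH = 16                 # assumed word width, matching the 16-bit example
MASK = (1 << WIDTH) - 1    # keeps values inside a 16-bit word


def quarter_by_bit_select(x):
    """Model {2'b0, x[15:2]}: keep bits 15..2 and zero-fill the top two bits."""
    x &= MASK
    upper_bits = x >> 2    # x[15:2] now sits in bits 13..0; bits 15..14 are 0
    return upper_bits


if __name__ == "__main__":
    for value in (0x0004, 0x0007, 0x8000, 0xFFFF):
        print(hex(value), "->", hex(quarter_by_bit_select(value)))
```

No arithmetic is performed here: the result is the same wires as the input, just connected two positions lower, which is why it costs no logic delay.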

This means your multiplications with positive coefficients are essentially free, and require no time at all.

Multiplying by a negative coefficient, however, means making a 2's-complement calculation, requiring inverting the bits and adding 1. That "adding 1" step implies a carry chain with delay equivalent to an adder of the same width.
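
The carry chain can be illustrated with a small bit-level model, again in Python rather than HDL. This is only a sketch assuming a plain ripple incrementer on a 16-bit word. The function name and the stage counter are made up for illustration, and the real carry logic in an FPGA is dedicated hardware whose timing this does not model.

```python
WIDTH = 16                 # assumed word width
MASK = (1 << WIDTH) - 1


def negate_ripple(x):
    """Return (two's-complement negation of x, bit positions the carry reached)."""
    inverted = ~x & MASK   # step 1: invert every bit
    carry = 1              # step 2: the "+1" enters as a carry into bit 0
    result = 0
    carry_stages = 0
    for bit in range(WIDTH):
        a = (inverted >> bit) & 1
        if carry:
            carry_stages += 1          # the carry was still alive at this bit
        result |= (a ^ carry) << bit   # sum bit of a half adder
        carry = a & carry              # carry out of a half adder
    return result, carry_stages


if __name__ == "__main__":
    for value in (0, 1, 4, 0x1234):
        negated, stages = negate_ripple(value)
        print(hex(value), "->", hex(negated), "carry stages:", stages)
```

For x = 0 the inverted word is all ones, so the carry has to travel through all 16 bit positions. That worst case is what makes the negation comparable in delay to a full-width adder.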

So now you're effectively comparing two systems that both have a critical path equivalent to two adders, and it's down to luck which one happens to synthesize with less delay.

If you're writing in SystemVerilog or using some other higher-level synthesis tool, the tool might even notice that one of the sums in the first version can be pipelined (calculated one clock cycle in advance) and thus reduce the critical path to a single adder.
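
A rough behavioural sketch of that retiming in Python, assuming a second-order direct form y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2]. The function names and the coefficient values in the usage lines are hypothetical (the values from the question's diagrams are not reproduced here). Whether a given tool actually performs this transformation is not something the sketch shows; it only checks that the two structures produce the same output.

```python
def fir_direct(samples, b0, b1, b2):
    """Direct form: both additions happen in the same clock cycle."""
    x1 = x2 = 0                      # delay registers holding x[n-1], x[n-2]
    out = []
    for x in samples:
        out.append(b0 * x + (b1 * x1 + b2 * x2))   # two adders in series
        x1, x2 = x, x1
    return out


def fir_direct_retimed(samples, b0, b1, b2):
    """Same filter, with the sum of the delayed taps computed a cycle early."""
    x1 = 0                           # delay register holding x[n-1]
    partial = 0                      # register holding b1*x[n-1] + b2*x[n-2]
    out = []
    for x in samples:
        out.append(b0 * x + partial)     # only one adder before the output
        partial = b1 * x + b2 * x1       # prepared now, used next cycle
        x1 = x
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 0, 0]
    coeffs = (4, 2, -1)              # hypothetical integer coefficients
    print(fir_direct(data, *coeffs))
    print(fir_direct_retimed(data, *coeffs))
```

The sum of the two delayed taps depends only on past samples, so it can sit behind its own register. Each clock cycle then contains at most one addition between registers.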