Finding Critical Path of Combinational Logic

Tags: digital-logic, synthesis, verilog

I have a combinational circuit and would like to find its critical path in Design Compiler (DC). Essentially, I want to find out how much the combinational logic will limit the maximum clock frequency of the larger sequential design.
To do this, I have added registers on the inputs of the combinational circuit (a simple multiplier in this case), clocked on the rising edge of a clock, as advised in "How to find the critical path delay of a big combinational block". I then run create_clock clk -period 5 -name clk and report_qor in DC, but I get a Critical Path Length of 0.00 ns, which looks odd. If I move the multiplier directly into the test module, however, I get a more reasonable-looking Critical Path Length of 4.88 ns.

module my_multiplier(
    output reg [31:0] out,
    input      [15:0] in1, in2,
    input             enable
);

always @(*) begin
    if (enable) begin
        out = in1 * in2;
    end
end

endmodule

I've created a separate module to instantiate the multiplier circuit and also clock the inputs to the multiplier:

module Test_multiplier_Tcrit(
    output  [31:0] out,
    input   [15:0] in1, in2,
    input          clk, enable
);

reg  [15:0] in1_reg, in2_reg;

my_multiplier my_multiplier(.out(out), .in1(in1_reg), .in2(in2_reg), .enable(enable)); 

always @(posedge clk) begin
   in1_reg <= in1; 
   in2_reg <= in2;
end

endmodule

Best Answer

Try putting a register on the output as well. Timing analysis is generally done register-to-register, so without an output register the tool may not be able to report a meaningful path.
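As a rough sketch of that suggestion, the test module from the question could capture the multiplier result in a register, so that the multiplier sits between a launching and a capturing flip-flop. The wire name out_comb is introduced here purely for illustration; everything else follows the question's code.

```verilog
// Sketch: the question's wrapper with an added output register.
// out_comb is a hypothetical name, not part of the original code.
module Test_multiplier_Tcrit(
    output reg [31:0] out,
    input      [15:0] in1, in2,
    input             clk, enable
);

reg  [15:0] in1_reg, in2_reg;
wire [31:0] out_comb;   // combinational result of the multiplier

my_multiplier my_multiplier(.out(out_comb), .in1(in1_reg), .in2(in2_reg), .enable(enable));

always @(posedge clk) begin
    in1_reg <= in1;       // launching registers on the inputs
    in2_reg <= in2;
    out     <= out_comb;  // capturing register on the output
end

endmodule
```

Note that enable is still an unregistered input in this sketch, so any path starting at enable is an input-to-register path rather than a register-to-register one.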

Related Solutions

Electronic – LUT vs. hard IP based multipliers on Spartan-3 FPGA for constant coefficient multiplication

According to the datasheet, the hard multiplier takes between 4 and 5 ns to propagate from inputs to outputs in combinational mode. You'll lose a few hundred more ps getting to and from the multiplier to the rest of your logic. If that's fast enough, then just make use of it.

If not, build your LUT-based multiplier by just writing some code with the * operator in it, synthesise it, place and route, and see if that's fast enough. You may need an attribute to force it not to use the hard multipliers (see the MULT_STYLE attribute in the XST manual). You could even try just forcing a single LUT-based (non-constant) multiplier with that constraint and see what the result is - that's a very quick test.
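A minimal sketch of such a quick test might look like the following. The module and port names are hypothetical, the 16-bit operand width is an assumption, and the exact spelling and placement of the attribute should be checked against the XST manual for your tool version.

```verilog
// Sketch only: lut_mult_test and its ports are hypothetical names.
// The mult_style attribute asks XST for a LUT-based multiplier; confirm
// the accepted values and syntax in the XST manual before relying on it.
(* mult_style = "lut" *)
module lut_mult_test (
    input             clk,
    input      [15:0] a, b,
    output reg [31:0] p
);

reg [15:0] a_reg, b_reg;

always @(posedge clk) begin
    a_reg <= a;               // register the inputs
    b_reg <= b;
    p     <= a_reg * b_reg;   // multiplier sits between registers
end

endmodule
```

Registering both sides of the multiplier means the post-place-and-route timing report shows the multiplier delay as a register-to-register path.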

Only if those fail should you go down the route of hand-building a LUT-based structure - and even then only if you've looked at the output of the synthesiser and are pretty sure you can beat it for some reason. The synthesisers have been tuned to work out constant coefficient multipliers very well in my experience - I doubt coregen will gain much.

Wet finger estimate: a LUT delay is ~0.7 ns. Assuming routing delays are of a similar magnitude, each LUT level costs roughly 1.4 ns, so you can afford a chain of only 3-4 LUTs within the 4-5 ns delay of the hard multiplier. It seems unlikely to me that you'll achieve what you need in that depth of logic.

Electronic – Verilog asynchronous reads of regs – and design question

Your code describes two multiplexers. These are actually asynchronous (combinational) components. The fact that Verilog requires data1_temp and data2_temp to be declared as reg is a quirk of Verilog syntax and your choice of coding style, and doesn't mean these signals would be the outputs of storage elements in a physical implementation.

If you want to capture these values in actual registers, you need to add those explicitly:

reg [7:0] data1, data2;
always @(posedge someclock) begin
    data1 <= data1_tmp;
    data2 <= data2_tmp;
end

"But I would like to know what this mini register file would be made of in hardware. Particularly, the 4x8 bit array consisting of k0,k1,k2,k3."

You haven't shown how these variables are assigned, so it's not possible to say how they are implemented. As your code showed, just declaring them as reg doesn't guarantee they are implemented with actual storage elements. If you assign them inside a block that begins always @(posedge clk) then very likely they are flip-flops, but there are ways you could code them that would make them synthesize differently.
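For illustration, here is one way k0 to k3 could be written so that they would very likely synthesize as flip-flops. The write port (we, waddr, wdata) and the module name are assumptions made for this sketch, since the original assignments were not shown.

```verilog
// Sketch: we, waddr, wdata and regfile_sketch are hypothetical names.
module regfile_sketch (
    input        clk,
    input        we,          // write enable (assumed)
    input  [1:0] waddr,       // write address (assumed)
    input  [7:0] wdata,       // write data (assumed)
    input  [1:0] reg1,        // read select
    output [7:0] data1_tmp
);

reg [7:0] k0, k1, k2, k3;

// Clocked writes: each k<n> is updated only on a rising clock edge.
always @(posedge clk) begin
    if (we) begin
        case (waddr)
            2'd0: k0 <= wdata;
            2'd1: k1 <= wdata;
            2'd2: k2 <= wdata;
            2'd3: k3 <= wdata;
        endcase
    end
end

// Combinational read: a plain multiplexer, no clock involved.
assign data1_tmp = (reg1 == 2'd0) ? k0 :
                   (reg1 == 2'd1) ? k1 :
                   (reg1 == 2'd2) ? k2 : k3;

endmodule
```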

"I thought when it came to registers and arrays like this, you need a clock to read out data, like RAM?"

You need a clock to update a (physical) register. You can read it out at any time. For example:

wire [8:0] sum;
assign sum = k0 + k1;

is perfectly valid code. sum will change whenever any of its inputs changes. If k0 and k1 are the outputs of flip-flops, their values will only change when there is a clock edge.

For another example, you could equally well describe your multiplexers with code like this:

reg [7:0] k0, k1, k2, k3;
wire [7:0] data1_tmp;
reg [1:0] reg1;
// k<n> and reg1 are assigned elsewhere.
assign data1_tmp = (reg1 == 0) ? k0 :
                   (reg1 == 1) ? k1 :
                   (reg1 == 2) ? k2 : k3;

"How do I read from this tag_array and do the comparison all within the same clock cycle?"

Let me repeat a key point for emphasis: You need to use a clock to assign a new value to a register (an actual hardware register or group of flip-flops). Its output is available at any time.
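A minimal sketch of that idea, assuming tag_array is built from flip-flops: the array is written on the clock edge, while the read and the comparison are purely combinational and so complete within the same cycle. The widths, depth, and all names other than tag_array are assumptions for illustration.

```verilog
// Sketch: only tag_array comes from the question; the other names,
// the 8-bit tag width and the 4-entry depth are assumed.
module tag_compare_sketch (
    input        clk,
    input        we,       // write enable (assumed)
    input  [1:0] index,    // entry to read or write (assumed)
    input  [7:0] tag_in,   // tag to store or compare (assumed)
    output       hit
);

reg [7:0] tag_array [0:3];

// Writing needs a clock edge.
always @(posedge clk) begin
    if (we)
        tag_array[index] <= tag_in;
end

// Reading and comparing need no clock: hit follows index and tag_in
// combinationally, within the same cycle.
assign hit = (tag_array[index] == tag_in);

endmodule
```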

RAMs are different and how you access the contents of a RAM will depend on details of the type of RAM you use.

"I got confused because frankly I don't know enough about memory hardware and how that's possible."

Another key strategy: When you are learning digital logic, I recommend you learn about the physical hardware first, and then work out or study how to simulate it in HDL second. So first, learn what a physical flip-flop is, then learn the standard Verilog methods of describing a flip-flop. Especially if you are trying to write HDL for synthesis, trying to write good code before you learn the capabilities of the underlying hardware will lead you down a lot of dead-end paths.
