Electronic – timing constraint for bus synchronizer circuits

clock, fpga, sdc, timing

I have a bus synchronizer circuit for passing a wide register across clock domains.

I'll provide a simplified description, omitting asynchronous reset logic.

The data is generated on one clock. Updates are many (at least a dozen) clock edges apart:

PROCESS (src_clk)
BEGIN
   IF RISING_EDGE(src_clk) THEN
      IF computation_done THEN
          data <= computation;
          ready_spin <= NOT ready_spin;
      END IF;
   END IF;
END PROCESS;

The control signal for fresh data is NRZI encoded, so a valid word on the bus corresponds to a transition on the control signal. The control signal passes through a DFF chain acting as a synchronizer:

PROCESS (dest_clk)
BEGIN
   IF RISING_EDGE(dest_clk) THEN
      ready_spin_q3 <= ready_spin_q2;
      ready_spin_q2 <= ready_spin_q1;
      ready_spin_q1 <= ready_spin;
   END IF;
END PROCESS;

The synchronizer circuit introduces a short delay, which provides plenty of time for the data bus to stabilize; the data bus is sampled directly without a risk of metastability:

PROCESS (dest_clk)
BEGIN
   IF RISING_EDGE(dest_clk) THEN
      IF ready_spin_q3 /= ready_spin_q2 THEN
         rx_data <= data;
      END IF;
   END IF;
END PROCESS;
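To put a number on that delay, here is a minimal cycle-level sketch in Python of the three-flop chain above. It is a behavioural model only, not a timing analysis. The function name `capture_edge` and the parameter `first_sampled_edge` are hypothetical. The parameter stands for the first dest_clk edge on which ready_spin_q1 actually captures the new level, which may be one edge late if the toggle lands too close to a clock edge.

```python
def capture_edge(first_sampled_edge=1):
    """Return the dest_clk edge (1-based) on which rx_data is loaded.

    first_sampled_edge is a hypothetical knob: the first dest_clk edge on
    which ready_spin_q1 captures the new level of ready_spin.
    """
    ready_spin = 1          # level after the toggle
    q1 = q2 = q3 = 0        # synchronizer flops still hold the old level
    edge = 0
    while True:
        edge += 1
        # Registers see the values from before this edge.
        load_rx_data = q3 != q2
        sampled = ready_spin if edge >= first_sampled_edge else 0
        q3, q2, q1 = q2, q1, sampled
        if load_rx_data:
            return edge


print(capture_edge())    # toggle caught on the first edge
print(capture_edge(2))   # toggle caught one edge late
```

In this model rx_data is loaded on the third dest_clk edge when the toggle is caught immediately, and on the fourth when it is caught one edge late. So the data bus has roughly two dest_clk periods to settle, less any skew between ready_spin and data.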

This compiles and works well when synthesized into a Cyclone II FPGA. However, TimeQuest reports setup and hold time violations, because it doesn't recognize the synchronizer. Worse, the Quartus manual says:

Focus on improving the paths that show the worst slack. The Fitter works hardest on
paths with the worst slack. If you fix these paths, the Fitter might be able to improve
the other failing timing paths in the design.

So I want to add the right timing constraints to my project so that Quartus will spend its Fitter effort on other areas of the design.

I'm pretty sure that set_multicycle_path is the proper SDC (Synopsys Design Constraints) command, since the data lines will have multiple cycles of the destination clock to stabilize, but I can't find any complete examples using this command to describe clock domain crossing logic.

I'd really appreciate some guidance on writing SDC timing constraints for synchronizers. If you see a problem with this approach, please also let me know that.

Clock detail:

External clock generator: Two channels, refclk = 20 MHz, refclk2 = refclk/2 (10 MHz, and related).

Altera PLL: src_clk = refclk * 9/5 = 36 MHz

Altera PLL: dest_clk = refclk2 * 10 = 100 MHz

I also have data going the other direction, with 100 MHz src_clk and 36 MHz dest_clk.

TL;DR: What are the correct SDC timing constraints for the above code?

Best Answer

I don't have experience with Quartus, so treat this as general advice.

When working on paths between clock domains, timing tools expand the clocks to the least common multiple of their periods and select the closest pair of edges.

For paths from a 36 MHz clock (27.777 ns) to a 100 MHz clock (10 ns), if I did my quick calculations correctly, the closest pair of rising edges is 138.888 ns on the source clock and 140 ns on the destination clock. That's effectively a 900 MHz constraint for those paths! Depending on rounding (or for clocks with no relationship), it could come out worse than that.
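As a rough check of that arithmetic, here is a minimal Python sketch that walks every launch edge in one repeat of the edge pattern and finds the nearest capture edge strictly after it. It assumes integer MHz frequencies and that both clocks have a rising edge at t = 0. The real phase relationship depends on the PLL configuration. The helper name `tightest_setup` is hypothetical and this is not how any particular tool implements the search.

```python
from fractions import Fraction
from math import gcd


def tightest_setup(src_mhz, dst_mhz):
    """Hypothetical helper: closest (launch, capture, gap) in ns.

    Assumes integer MHz frequencies and rising edges aligned at t = 0.
    """
    src_period = Fraction(1000, src_mhz)
    dst_period = Fraction(1000, dst_mhz)
    # The edge pattern repeats after the least common multiple of the periods.
    hyperperiod = Fraction(1000, gcd(src_mhz, dst_mhz))

    best = None
    launch = Fraction(0)
    while launch < hyperperiod:
        # First capture edge strictly after this launch edge.
        capture = (launch // dst_period + 1) * dst_period
        gap = capture - launch
        if best is None or gap < best[2]:
            best = (launch, capture, gap)
        launch += src_period
    return best


launch, capture, gap = tightest_setup(36, 100)
print(float(launch), float(capture), float(gap))
print(float(1000 / gap), "MHz equivalent")
```

For 36 MHz to 100 MHz the pattern repeats every 250 ns, and the tightest pair is the launch at 1250/9 ns with the capture at 140 ns, a gap of 10/9 ns. Under the same assumptions the 100 MHz to 36 MHz direction also has a worst-case gap of 10/9 ns.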

There are at least three ways to write constraints for this structure. I am going to call the clocks fast_clk and slow_clk as I think that's clearer for illustration.

Option 1: disable timing with set_false_path

The easiest solution is to use set_false_path to disable timing between the clocks:

set_false_path -from [get_clocks fast_clk] -to [get_clocks slow_clk]
set_false_path -from [get_clocks slow_clk] -to [get_clocks fast_clk]

This is not strictly correct, since there are timing requirements for the synchronizer to work correctly. If the physical implementation delays the data too much relative to the control signal, then the synchronizer will not work. However, since there isn't any logic on the path, it's unlikely that the timing constraint will be violated. set_false_path is commonly used for this kind of structure, even in ASICs, where designers are more cautious about low-probability failures than they are with FPGAs.

Option 2: relax the constraint with set_multicycle_path

You can allow additional time for certain paths with set_multicycle_path. It is more common to use multicycle paths with closely related clocks (e.g. interacting 1X and 2X clocks), but it will work here if the tool supports it sufficiently.

set_multicycle_path 2 -from [get_clocks slow_clk] -to [get_clocks fast_clk] -end -setup
set_multicycle_path 1 -from [get_clocks slow_clk] -to [get_clocks fast_clk] -end -hold

The default edge relationship for setup is single cycle, i.e. set_multicycle_path 1. These commands allow one more cycle of the endpoint clock (-end) for setup paths. A -hold adjustment with a number one less than the setup constraint is almost always needed when setting multicycle paths. The section on edge selection below explains why.

To constrain paths in the other direction similarly (relaxing the constraint by one period of the faster clock), change -end to -start:

set_multicycle_path 2 -from [get_clocks fast_clk] -to [get_clocks slow_clk] -start -setup
set_multicycle_path 1 -from [get_clocks fast_clk] -to [get_clocks slow_clk] -start -hold

Option 3: specify requirement directly with set_max_delay

This is similar to the effect of set_multicycle_path but saves having to think through the edge relationships and the effect on hold constraints.

set_max_delay 10 -from [get_clocks fast_clk] -to [get_clocks slow_clk]
set_max_delay 10 -from [get_clocks slow_clk] -to [get_clocks fast_clk]

You may want to pair this with set_min_delay for hold checks, or leave the default hold check in place. You may also be able to do set_false_path -hold to disable hold checks, if your tool supports it.

Gory details of edge selection for multi-cycle paths

To understand the hold adjustment that gets paired with each setup adjustment, consider this simple example with a 3:2 relationship. Each digit represents a rising clock edge:

1     2     3
4   5   6   7

The default setup check uses edges 2 and 6. The default hold check uses edges 1 and 4.

Applying a multi-cycle constraint of 2 with -end adjusts the default setup and hold checks to use the next edge after what they were originally using, meaning the setup check now uses edges 2 and 7 and the hold check uses edges 1 and 5. For two clocks at the same frequency, this adjustment makes sense — each data launch corresponds with one data capture, and if the capture edge is moved out by one, the hold check should also move out by one. This kind of constraint might make sense for two branches of a single clock if one of the branches has a large delay. However, for the situation here, a hold check using edges 1 and 5 isn't desirable, since the only way to fix it is to add an entire clock cycle of delay on the path.

The multi-cycle hold constraint of 1 (for hold, the default is 0) adjusts the edge of the destination clock used for hold checks backwards by one edge. The combination of 2-cycle setup MCP and 1-cycle hold MCP constraints will result in a setup check using edges 2 and 7, and a hold check using edges 1 and 4.
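The edge bookkeeping can be sketched in a few lines of Python. This is a toy model of this one 3:2 example only, not a general reimplementation of how a timing tool selects edges. It hard-codes the default checks described above and applies the -end adjustments. The names `checked_edges` and `relationship` are hypothetical, and the time units are arbitrary (launch period 6, capture period 4).

```python
# Edge labels and times follow the 3:2 diagram; time units are arbitrary.
LAUNCH_TIME = {1: 0, 2: 6, 3: 12}          # launch clock, period 6
CAPTURE_TIME = {4: 0, 5: 4, 6: 8, 7: 12}   # capture clock, period 4


def checked_edges(setup_mcp=1, hold_mcp=0):
    """Toy model of -end multicycle adjustments for this one example.

    Returns {"setup": (launch, capture), "hold": (launch, capture)}
    using the edge labels from the diagram.
    """
    setup_shift = setup_mcp - 1            # the setup default is 1
    hold_shift = setup_shift - hold_mcp    # hold follows setup, then moves back
    return {
        "setup": (2, 6 + setup_shift),
        "hold": (1, 4 + hold_shift),
    }


def relationship(check):
    """Capture time minus launch time for one (launch, capture) pair."""
    launch, capture = check
    return CAPTURE_TIME[capture] - LAUNCH_TIME[launch]


for args in [(1, 0), (2, 0), (2, 1)]:
    edges = checked_edges(*args)
    print(args, edges, relationship(edges["setup"]), relationship(edges["hold"]))
```

With a setup multiplier of 2 and no hold adjustment, the hold relationship in this model becomes 4 time units (edges 1 and 5), which is the undesirable check. Adding the hold multiplier of 1 brings it back to 0 (edges 1 and 4) while the setup relationship stays relaxed at 6.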

Related Solutions

Electronic – Does it always make sense to constrain an I/O port

Well, it does make sense to apply meaningful constraints if you actually care about timing and it does matter. How to constrain it heavily depends on your design. Thankfully, Altera has tons of examples for different cases.

But if you don't care at all, then the best way to go is to mark that path as a false path, so that TimeQuest is happy and the tools do not spend hours trying to route your design to meet timing requirements that you don't really have. You can do that with the set_false_path command. For example:

set_false_path -from * -to [get_ports { output_port }]

(where output_port is a module's top level port assigned to a pin)

If TimeQuest reports that not every output port has a delay constraint, you may want to add a dummy delay as well, like this:

set_output_delay -clock [get_clocks src_clk] 2 [get_ports { output_port }]

For a more practical example, you can check out this SDC file for this top-level module. The path to led_n is marked as a false path there, since I pretty much don't care about timing from my logic to the LEDs.

Hope it helps.

Electronic – ASIC timing constraints via SDC: How to correctly specify a multiplexed clock

Define divide-by-1 generated clocks on the and_* nets and declare them to be physically exclusive. Cadence RTL Compiler handles the situation correctly by generating 3 timing paths for registers clocked by cpu_clk (one path per clock). Registers directly driven by clk0, clk4 and clk_ext have their own timing arcs.

create_generated_clock -source [get_ports clk0] \
-divide_by 1 -name and_clk0    [get_pins and_cpu_1/Y]

create_generated_clock -source [get_ports clk4] \
-divide_by 1 -name and_clk4    [get_pins and_cpu_2/Y]

create_generated_clock -source [get_ports clk_ext] \
-divide_by 1 -name and_clk_ext [get_pins and_cpu_ext1/Y]

set_clock_groups \
-physically_exclusive \
-group [get_clocks and_clk0] \
-group [get_clocks and_clk4] \
-group [get_clocks and_clk_ext]
