I don't have experience with Quartus, so treat this as general advice.
When working on paths between clock domains, timing tools expand the clocks to the least common multiple of their periods and select the closest pair of edges.
For paths from a 36 MHz clock (27.777 ns) to a 100 MHz clock (10 ns), if I did my quick calculations correctly, the closest pair of rising edges is 138.888 ns on the source clock and 140 ns on the destination clock. That's effectively a 900 MHz constraint for those paths! Depending on rounding (or for clocks with no relationship), it could come out worse than that.
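If it helps to sanity-check that arithmetic, the edge search can be sketched in a few lines of Python (a simplified model: ideal clocks, rising edges only, no clock latency or uncertainty):

```python
from fractions import Fraction
from math import gcd

def tightest_setup_ns(f_src_hz, f_dst_hz):
    """Smallest positive gap (in ns) from a rising source-clock edge to the
    next rising destination-clock edge, over one common period."""
    t_src = Fraction(1, f_src_hz)                  # source period, seconds
    t_dst = Fraction(1, f_dst_hz)                  # destination period
    common = Fraction(1, gcd(f_src_hz, f_dst_hz))  # LCM of the two periods
    src_edges = [k * t_src for k in range(int(common / t_src))]
    dst_edges = [m * t_dst for m in range(int(common / t_dst) + 1)]
    gaps = [d - s for s in src_edges for d in dst_edges if d > s]
    return min(gaps) * 10**9                       # seconds -> ns

req = tightest_setup_ns(36_000_000, 100_000_000)
print(req)             # 10/9 ns, i.e. ~1.111 ns
print(1 / req * 1000)  # equivalent frequency in MHz: 900
```

For 36 MHz and 100 MHz this reproduces the 1.111 ns (900 MHz) figure: the common period is 250 ns (9 cycles of the slow clock, 25 of the fast one), and the closest edges are the fifth slow edge at 138.888 ns and the fourteenth fast edge at 140 ns.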
There are at least three ways to write constraints for this structure. I am going to call the clocks fast_clk and slow_clk as I think that's clearer for illustration.
Option 1: disable timing with set_false_path
The easiest solution is to use set_false_path to disable timing between the clocks:
set_false_path -from [get_clocks fast_clk] -to [get_clocks slow_clk]
set_false_path -from [get_clocks slow_clk] -to [get_clocks fast_clk]
This is not strictly correct, since there are timing requirements for the synchronizer to work correctly. If the physical implementation delays the data too much relative to the control signal, then the synchronizer will not work. However, since there isn't any logic on the path, it's unlikely that the timing constraint will be violated. set_false_path is commonly used for this kind of structure, even in ASICs, where the effort vs. risk tradeoff for low-probability failures is more cautious than for FPGAs.
Option 2: relax the constraint with set_multicycle_path
You can allow additional time for certain paths with set_multicycle_path. It is more common to use multicycle paths with closely related clocks (e.g. interacting 1X and 2X clocks), but it will work here if the tool supports it sufficiently.
set_multicycle_path 2 -from [get_clocks slow_clk] -to [get_clocks fast_clk] -end -setup
set_multicycle_path 1 -from [get_clocks slow_clk] -to [get_clocks fast_clk] -end -hold
The default edge relationship for setup is single cycle, i.e. set_multicycle_path 1. These commands allow one more cycle of the endpoint clock (-end) for setup paths. The -hold adjustment, with a number one less than the setup constraint, is almost always needed when setting multi-cycle paths; see below for details.
To constrain paths in the other direction similarly (relaxing the constraint by one period of the faster clock), change -end to -start:
set_multicycle_path 2 -from [get_clocks fast_clk] -to [get_clocks slow_clk] -start -setup
set_multicycle_path 1 -from [get_clocks fast_clk] -to [get_clocks slow_clk] -start -hold
Option 3: specify requirement directly with set_max_delay
This is similar in effect to set_multicycle_path, but it saves having to think through the edge relationships and the effect on hold constraints.
set_max_delay 10 -from [get_clocks fast_clk] -to [get_clocks slow_clk]
set_max_delay 10 -from [get_clocks slow_clk] -to [get_clocks fast_clk]
You may want to pair this with set_min_delay for hold checks, or leave the default hold check in place. You may also be able to use set_false_path -hold to disable hold checks, if your tool supports it.
Gory details of edge selection for multi-cycle paths
To understand the hold adjustment that gets paired with each setup adjustment, consider this simple example with a 3:2 relationship. Each digit represents a rising clock edge:
1     2     3
4   5   6   7
The default setup check uses edges 2 and 6. The default hold check uses edges 1 and 4.
Applying a multi-cycle constraint of 2 with -end adjusts the default setup and hold checks to use the next edge after what they were originally using, meaning the setup check now uses edges 2 and 7 and the hold check uses edges 1 and 5. For two clocks at the same frequency, this adjustment makes sense — each data launch corresponds with one data capture, and if the capture edge is moved out by one, the hold check should also move out by one. This kind of constraint might make sense for two branches of a single clock if one of the branches has a large delay. However, for the situation here, a hold check using edges 1 and 5 isn't desirable, since the only way to fix it is to add an entire clock cycle of delay on the path.
The multi-cycle hold constraint of 1 (for hold, the default is 0) adjusts the edge of the destination clock used for hold checks backwards by one edge. The combination of the 2-cycle setup MCP and 1-cycle hold MCP constraints results in a setup check using edges 2 and 7, and a hold check using edges 1 and 4.
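The edge bookkeeping above can be sketched as a small model (my assumptions: ideal clocks as drawn, setup captures on the first destination edge strictly after launch, and the default hold check pairs a launch edge with the latest destination edge at or before it; real tools also fold in clock latency and uncertainty):

```python
# Model of multicycle edge selection for the 3:2 example above.
# Slow (launch) clock edges 1..3, fast (capture) clock edges 4..7.
SLOW = {1: 0, 2: 3, 3: 6}              # edge label -> time (arbitrary units)
FAST = {4: 0, 5: 2, 6: 4, 7: 6}
FAST_ORDER = sorted(FAST, key=FAST.get)

def setup_edges(setup_mcp=1):
    """Tightest setup pair: each launch edge captures on the first fast edge
    strictly after it, pushed (setup_mcp - 1) edges later by an -end MCP."""
    pairs = []
    for l, lt in SLOW.items():
        after = [f for f in FAST_ORDER if FAST[f] > lt]
        if len(after) >= setup_mcp:
            c = after[setup_mcp - 1]
            pairs.append((FAST[c] - lt, l, c))   # (margin, launch, capture)
    _, l, c = min(pairs)
    return l, c

def hold_edges(setup_mcp=1, hold_mcp=0):
    """Tightest hold pair: launch against the latest fast edge at or before
    it, shifted forward with the setup MCP and back by the hold MCP."""
    pairs = []
    for l, lt in SLOW.items():
        before = [f for f in FAST_ORDER if FAST[f] <= lt]
        if before:
            c = before[-1]
            pairs.append((lt - FAST[c], l, c))
    _, l, c = min(pairs)
    idx = FAST_ORDER.index(c) + (setup_mcp - 1) - hold_mcp
    return l, FAST_ORDER[idx]

print(setup_edges())     # default setup:                     (2, 6)
print(hold_edges())      # default hold:                      (1, 4)
print(setup_edges(2))    # 2-cycle -end setup MCP:            (2, 7)
print(hold_edges(2, 0))  # ...without the hold MCP:           (1, 5)
print(hold_edges(2, 1))  # ...with the 1-cycle hold MCP:      (1, 4)
```

This reproduces the sequence in the text: the setup MCP alone drags the hold check out to edges 1 and 5, and the paired 1-cycle hold MCP pulls it back to edges 1 and 4.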
Best Answer
There are certainly reasons why it is useful, but it really depends on the design.
For massively interconnected designs which don't have nice groupings (e.g. there are lots of processing cores which depend heavily on all the other cores, rather than each core operating independently), the synthesis tools can struggle to see the wood for the trees.
They try to bunch all of the logic as close together as possible for timing, but because the tools can't see how to group it into small sections, this can actually result in worse FMax as bits of cores get scattered within other cores due to resource scarcity or routing congestion.
By using LogicLock regions or equivalent, you can help the tools to see blocks which should be grouped together, and this can improve the timing performance as the tools can more tightly pack parts within the LogicLock regions.
If there are many clocks in a design, you can also LogicLock registers that belong to one clock into a specific region to try to reduce the number of global clocks required. The synthesis tools are quite good at this nowadays, though, so this is probably not needed.
Another reason is if you have logic which is being pulled strongly in two directions (e.g. a memory PHY in one corner, a processor in the other corner, and interconnect fabric in between). If one part is, say, running at a higher frequency than the other, then ideally any clock crossing would sit closer to the high-speed portion to cope with its timing requirements; however, if the logic is being pulled strongly in two directions, it can be hard for the tools to optimise. There have been times where adding a LogicLock region for this sort of reason has taken designs I've worked on from failing timing to passing.
For more exotic use cases, such as time-to-digital conversion, you would use long carry chains to convert a pulse width into a multi-bit code. This technique typically requires precisely controlled and repeatable propagation delays, so constraining placement down to the exact register or LUT can be required.
I can't speak for Libero, but in Quartus unconstrained logic can still be placed within unused portions of a LogicLock region (unless you specifically disallow this). If you add debug logic like SignalTap, the tools are free to place it wherever they want (unless you constrain SignalTap to a region), including adding the tap logic within the LogicLocked region.
Finally, you might want to save a region of the FPGA for a specific future expansion, so you might constrain the current design to a smaller portion of the FPGA so that you know you have the space you need later on.
Unless you have a reason to do so, it's usually best to leave it up to the synthesis tools and not overconstrain the design to begin with.
If you start running into issues with, say, timing analysis, then you could start to investigate whether there are lots of long timing paths that appear to be due to high-speed logic being widely distributed rather than packed tightly. The Chip Planner is quite useful here, as in Quartus at least you can get it to show timing paths.
The fix might be to add more pipelining, or to start constraining logic to certain regions. Adding regional constraints can also let you pick apart complex designs to, say, group high-speed logic, and then see how that affects other paths from lower-speed regions, which could point towards good places to add pipelining.