Electrical – How to cross clock domains efficiently

digital-logic, fpga, spartan, verilog, xilinx

I have a question about sending a short-duration signal from a faster clock domain to a slower one. I am trying to implement a dual frame buffer in a dual-port (dual-clock) RAM. Once an entire frame has been stored, the write-clock side asserts the FrameFull register. At the end of each displayed frame, the display requests a new frame. If one is available, the request sets the read pointer for the RAM accordingly and is also used to deassert the FrameFull signal, so that a new frame can be loaded while the most recent one is being displayed.

The read side is operating at 50 MHz.
The write side is operating at 27 MHz.

To synchronize the FrameFull signal from the write domain, I have read that it's best to use a couple of synchronizing flip-flops. Since FrameFull remains asserted until the buffer is switched, I don't think this is very problematic (the FrameFull signal can't be missed).
The SwitchSuccesful signal is asserted in the read clock domain when a new frame is requested and FrameFull is 1, indicating a buffer exchange. SwitchSuccesful then needs to be sampled by the write domain so that FrameFull can be reset to 0 and the write side can start storing a new frame. I have thought about it and decided to use a 16-bit shift register which, when the buffers are switched, is loaded with 16'hFF and then shifts left, with a zero shifted in on each read clock. I could then OR all of its bits together and synchronize the result with flip-flops in the write domain before sampling it. Will this be sufficient to prevent both a missed SwitchSuccesful and metastability?

CODE :

always @ (posedge read_clock)
begin
    if(SwitchRequest) begin
          case ({Frame_FullSync1, Frame_Read})   //Concatenate into a 2-bit selector
          2'b00 : begin rd_ptr <= /*Some Value*/; Frame_Read <= 0; end           //Restore to Previous 0.
          2'b01 : begin rd_ptr <= /*Some Value*/; Frame_Read <= 1; end           //Restore to Previous 1.
          2'b10 : begin rd_ptr <= /*Some Value*/; Frame_Read <= 1; LE <= 1; end  //Load new ----- 0 to 1.
          2'b11 : begin rd_ptr <= /*Some Value*/; Frame_Read <= 0; LE <= 1; end  //Load new ----- 1 to 0.
          endcase
    end
    if(LE) LE <= 0;   //LE is high for one read_clock cycle only
end

always @ (posedge read_clock)  begin 
    //Frame Full Synchronization from write domain to read domain
    Frame_FullSync1 <= Frame_FullSync0;
    Frame_FullSync0 <= FrameFull;
    //LE was asserted for 1 clock cycle when buffer switch was successful
    if(LE) Sync <= 16'hFF;
    else   Sync <= {Sync[14 : 0],1'b0};   //Shift left, shifting in a zero
end

assign SwitchSuccesful = |Sync;

always @ (posedge write_clock) begin
    //Synchronization of bitwise OR
    SwitchSuccesfulSync0 <= SwitchSuccesful;
    SwitchSuccesfulSync1 <= SwitchSuccesfulSync0; 
end
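
For illustration, here is a minimal sketch of how the write side might consume the synchronized signal to clear FrameFull. It is not part of the original code: the module name frame_full_ctrl, the FrameDone strobe, and the extra register SwitchSuccesfulSync2 are hypothetical. Because a stretched pulse is meant to stay high for several write_clock cycles, the sketch acts only on its rising edge.

```verilog
// Sketch only: frame_full_ctrl, FrameDone and SwitchSuccesfulSync2 are
// hypothetical names introduced for illustration.
module frame_full_ctrl (
    input  wire write_clock,
    input  wire SwitchSuccesfulSync1,  // already synchronized into write_clock
    input  wire FrameDone,             // assumed one-cycle strobe: last pixel of a frame written
    output reg  FrameFull = 1'b0
);
    // Delayed copy of the synchronized signal, used for edge detection.
    reg SwitchSuccesfulSync2 = 1'b0;

    // High for exactly one write_clock cycle at the start of the stretched pulse.
    wire SwitchSeen = SwitchSuccesfulSync1 & ~SwitchSuccesfulSync2;

    always @ (posedge write_clock) begin
        SwitchSuccesfulSync2 <= SwitchSuccesfulSync1;
        if (SwitchSeen)
            FrameFull <= 1'b0;   // buffers were swapped: free to store the next frame
        else if (FrameDone)
            FrameFull <= 1'b1;   // a complete frame is waiting for the read side
    end
endmodule
```

Acting on the edge rather than the level means the length of the stretched pulse does not affect how long FrameFull is held low.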

Best Answer

Assuming the data of interest changes on the falling edge of the 27MHz clock and is sampled on its rising edge, the approach with minimal delay is as follows. Use a divide-by-two, clocked by the 27MHz clock, whose output flips on each rising edge. Feed that output into a double synchronizer that samples on the falling edge of the 50MHz clock, and capture everything else on the rising edge of the 50MHz clock.

If two consecutive falling edges of the 50MHz clock report opposite states for the divide-by-two output, the real edge must have happened somewhere within that interval. The data will have been sampled halfway between those two 50MHz clock events, within about 10ns of the 27MHz clock edge.
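
The scheme described above could be sketched roughly as follows. This is an untested illustration under stated assumptions, not a definitive implementation: the module name, port names, and the 8-bit data width are all hypothetical, and the registers rely on FPGA-style initial values instead of a reset.

```verilog
// Sketch only: all names and the 8-bit width are hypothetical.
module toggle_edge_capture (
    input  wire       clk27,         // slower source clock
    input  wire       clk50,         // faster destination clock
    input  wire [7:0] data27,        // assumed to change on the falling edge of clk27
    output reg  [7:0] data50 = 8'd0, // data word captured in the clk50 domain
    output reg        data50_valid = 1'b0  // one clk50 cycle per detected clk27 rising edge
);
    // Divide-by-two: flips on every rising edge of clk27.
    reg toggle27 = 1'b0;
    always @(posedge clk27)
        toggle27 <= ~toggle27;

    // Double synchronizer (sync0, sync1) clocked on the FALLING edge of clk50.
    // sync2 holds the previous synchronized value for comparison.
    reg sync0 = 1'b0, sync1 = 1'b0, sync2 = 1'b0;
    always @(negedge clk50) begin
        sync0 <= toggle27;
        sync1 <= sync0;
        sync2 <= sync1;
    end

    // Everything else is captured on the RISING edge of clk50.
    // d0/d1 delay the raw data samples so that d1 lines up with the
    // rising edge that lay between the two falling-edge samples that differ.
    reg [7:0] d0 = 8'd0, d1 = 8'd0;
    always @(posedge clk50) begin
        d0 <= data27;
        d1 <= d0;
        data50_valid <= (sync1 != sync2);  // opposite states: a clk27 edge fell in between
        if (sync1 != sync2)
            data50 <= d1;                  // sample taken near the clk27 rising edge
    end
endmodule
```

Only the data sample taken between the two differing falling-edge samples is kept, which is the one the answer argues lies close to the middle of the data-valid window. Timing constraints for the crossing paths would still be needed in a real design.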

Sampling everything on the same edge of the 50MHz clock may not be sufficient to allow reliable decoding of events. Consider the following two scenarios, with the first line representing 50MHz clock events and the second representing 30MHz clock events (the same principles would apply at 27MHz); both lines have a scale of about 3.3ns per character:

|-----|-----|-----|-----|-----|--  -- 50MHz clock rising edges
-|---------|---------|---------|-  -- 30MHz clock rising edges
_x---------x_________x---------x_  -- 30MHz/2, changing on rising edges
______x---------x_________x------  -- 30MHz/2, changing on falling edges
000000x111111111x222222222x333333  -- Data words, changing on 30MHz fall

Consecutive samples of the divided 30MHz clocks and data words would yield:

_-__--_--_  30MHz/2, changing on rising edges, then sampled @50MHz rising
_x-__-x_--  30MHz/2, changing on falling edges, then sampled @50MHz rising
0x1223x455  Data, sampled @50MHz rising

There is no spot relative to observed edges on either form of the 30MHz clock where the output would be guaranteed to be clean. On the other hand, if the "30MHz/2 rising" signal were sampled on falling edges of the 50MHz clock, that would yield, relative to the data:

--_x-__-x_  30MHz/2, changing on rising edges, sampled @50MHz falling
0x1223x455  Data, sampled @50MHz rising

If a 30MHz clock edge coincided with a falling edge of the 50MHz clock, there could be 20ns of uncertainty as to when the edge occurred, but the same data would be present on the cycles before and after, so that uncertainty wouldn't matter.