Electronic – Understanding Skid Buffer Mechanism

buffer, fpga, verilog

I have some questions about http://fpgacpu.ca/fpga/Pipeline_Skid_Buffer.html

1) Why is the skid buffer designed as a 2-entry FIFO instead of just a 1-entry FIFO?

However, pipelining handshaking is more complicated: simply adding a
pipeline register to the valid, ready, and data lines will work, but
now each transfer takes two cycles to start, and two cycles to stop.

2) In the passage quoted above from the article, why does a transfer take two cycles to start and two cycles to stop?

Best Answer

At least in one specific community (the one concerned with synchronous elastic systems), the capacity-2 buffer is called an elastic buffer (EB), and it is NOT the simplest primitive. As you noted, a capacity-1 buffer is simpler, and indeed it is this capacity-1 buffer that is called a "skid buffer" in the synchronous elastic community (see, e.g., "Automatic Pipelining of Elastic Systems", Fig. 2.5 and Fig. 2.7).

The capacity-2 buffer is then just a capacity-1 buffer in series with an ordinary register, plus control logic. There are two ways to order the two: register first, or capacity-1 buffer first.

So why does the article linked in the question use a capacity-2 buffer? Because the article is about pipelining, so by definition all data must be registered on every possible transaction. In a capacity-1 buffer, data passes through 0 or 1 registers, so it is not always registered. In a capacity-2 buffer, data passes through 1 or 2 registers, which makes it useful for pipelining. To put it differently: one register is used for pipelining (it always cuts the combinational path), and the other register is used for "elasticity" (so that data is not dropped on the floor when the downstream is stalled but the upstream is still sending).

One important point is that when pipelining a path with a stalling protocol (ready/stall/stop, etc.), the stalling signal also has to be registered so that it does not limit the clock rate. See "SELF: Specification and design of synchronous elastic circuits" for a short discussion of that (they propose using latches, but also acknowledge that FFs may be needed in some cases, such as on FPGAs). From this observation follows the need for an additional data register: because the upstream sees the stall one cycle late, it may still be sending in the cycle when the downstream stalls, and that data item has to be preserved somewhere. It follows the item currently held in the "main" register, and the design has to ensure that this order is preserved.
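
To make that last point concrete, here is a minimal cycle-level sketch in Python (not HDL, and not the article's implementation). It assumes the simplest possible registered stall: the `ready` seen by the upstream is just the downstream `ready` delayed by one cycle. The function name `simulate` and its parameters are made up for illustration. With only the main register (`capacity=1`), the item sent during the first stalled cycle has nowhere to go; with one extra register (`capacity=2`), it is kept and the order is preserved.

```python
from collections import deque


def simulate(items, downstream_ready, capacity):
    """Cycle-level model of a pipeline stage with a registered `ready`.

    Assumption (illustrative only): the upstream sees the downstream
    `ready` one cycle late. `capacity` is the number of data registers
    in the stage: 1 = main register only, 2 = main register + extra one.
    Returns (delivered, dropped).
    """
    pending = deque(items)      # items the upstream still wants to send
    regs = deque()              # regs[0] models the main (output) register
    ready_to_upstream = True    # registered ready, assumed high after reset
    delivered, dropped = [], []

    for dn_ready in downstream_ready:
        # The upstream only sees the registered (one-cycle-old) ready.
        incoming = None
        if pending and ready_to_upstream:
            incoming = pending.popleft()

        # The downstream takes the item in the main register if it is ready.
        if regs and dn_ready:
            delivered.append(regs.popleft())

        # At the clock edge, capture the incoming item if a register is free.
        if incoming is not None:
            if len(regs) < capacity:
                regs.append(incoming)
            else:
                dropped.append(incoming)  # no room: the item is lost

        # Register the stall signal for the next cycle.
        ready_to_upstream = bool(dn_ready)

    return delivered, dropped


items = [1, 2, 3, 4, 5, 6]
stall_pattern = [1, 1, 0, 0, 1, 1, 1, 1, 1, 1]  # downstream stalls in cycles 2 and 3

print(simulate(items, stall_pattern, capacity=1))  # ([1, 2, 4, 5, 6], [3])
print(simulate(items, stall_pattern, capacity=2))  # ([1, 2, 3, 4, 5, 6], [])
```

In this model, item 3 is sent in the first stalled cycle because the upstream has not seen the stall yet. A real skid buffer would usually derive `ready` from its own registered state rather than from a plain delay, but the need for the second data register is the same.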