Electronic – FPGA double buffer strategy

Tags: buffer, fpga, vhdl

I am working on an FPGA project where a host CPU writes a 10,240 × 16-bit lookup table into FPGA logic. I've implemented this with on-chip memory, storing the values and reading them out when needed.

An external trigger/go pulse kicks off a processing cycle which lasts several hundred thousand clock cycles. Once this trigger arrives, the state of the 10,240 × 16 LUT needs to be frozen, or latched, so it can be used during the processing cycle. Unfortunately, the data needs to be available very soon after this "GO" pulse, so there is not enough time to do a complete buffer copy.

The host also needs to be able to continually update some values of the lookup table while the current cycle is being executed, in order to set up for the next processing cycle. To allow for both cases (latching the state of the lookup table, but also letting the host update it at any time), I think that ping/pong-style double buffering is the way to go: the host writes to one buffer until the "GO" command arrives, then the host writes to the other. The FPGA logic always reads out of the buffer not currently being written.
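To make the scheme (and the problem described below) concrete, here is a small behavioral model of the ping/pong swap. This is a Python sketch, not HDL, and names such as `PingPongLUT` and `LUT_DEPTH` are illustrative assumptions, not anything from an actual design:

```python
LUT_DEPTH = 10240  # 10,240 x 16-bit entries

class PingPongLUT:
    """Behavioral model of a ping/pong double-buffered LUT."""

    def __init__(self):
        self.buffers = [[0] * LUT_DEPTH, [0] * LUT_DEPTH]
        self.write_sel = 0  # index of the buffer the host writes to

    def host_write(self, addr, value):
        # Host always writes the currently unlocked buffer.
        self.buffers[self.write_sel][addr] = value

    def logic_read(self, addr):
        # Processing logic always reads the buffer NOT being written.
        return self.buffers[1 - self.write_sel][addr]

    def go_pulse(self):
        # On "GO", swap roles: the just-written buffer freezes for reading.
        self.write_sel = 1 - self.write_sel
```

Note that after two swaps, any entry the host wrote only once is present in just one of the two buffers, which is exactly the dropped-update problem described next.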

However, since the host is not rewriting all 10,240×16 values when it does its sporadic updates, the buffer that is not being written to is essentially "dropping" the updates while it's frozen.

Is there a novel way to handle this scenario? I'm thinking there needs to be some kind of buffer resynchronization process once the buffer is unfrozen.

Best Answer

One possible strategy could be to use stale bits. I don't know if that's standard terminology, but the idea is similar to a dirty bit. Writing a new entry clears the corresponding stale bit in the unlocked buffer and sets the bit in the locked buffer. After switching buffers, have an internal copy routine transfer every entry marked stale in the unlocked buffer from the locked buffer to the unlocked buffer. That way, new data written while the copy is in progress will not be overwritten, and all of the old updates are retained. The only thing you need to ensure is that the copy operation has enough time to complete between buffer switches; if it doesn't, keep a separate record of which entries are stale so the copy routine doesn't have to iterate over all 10,240 of them.

Another possible strategy could be to store the updated entries while a single buffer is locked, then apply only those updates once it is unlocked. If only a handful of entries change per cycle, this might be more efficient. The updates could be stored in a linked list or similar structure so that the list can be traversed efficiently and multiple updates to the same location can be coalesced.
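A sketch of this deferred-update approach, again as a Python behavioral model: a dict stands in for the linked list here, since it coalesces repeated writes to the same address for free (in hardware you might instead use a small FIFO of (addr, value) pairs plus the coalescing logic the answer mentions). `UpdateLogLUT` and its method names are illustrative:

```python
LUT_DEPTH = 10240

class UpdateLogLUT:
    """Single LUT plus a pending-update log applied at unlock time."""

    def __init__(self):
        self.lut = [0] * LUT_DEPTH  # the one buffer the FPGA logic reads
        self.locked = False
        self.pending = {}           # addr -> latest value written while locked

    def host_write(self, addr, value):
        if self.locked:
            # Defer the write; a later write to the same addr coalesces.
            self.pending[addr] = value
        else:
            self.lut[addr] = value

    def lock(self):    # called on the "GO" pulse
        self.locked = True

    def unlock(self):  # called when the processing cycle ends
        self.locked = False
        for addr, value in self.pending.items():
            self.lut[addr] = value
        self.pending.clear()
```

The trade-off versus the stale-bit scheme is that the apply step scales with the number of distinct addresses updated rather than the LUT depth, but the log memory must be sized for the worst-case number of updates per cycle.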