Looking at the TI datasheet for 74LVC1G125, if you scroll down past the absolute maximum ratings, you'll get to the recommended operating conditions, including a maximum I_OH and I_OL. TI specs 32 mA max for either one, and that's only if you're providing 4.5 - 5 V Vcc. If you're using 3.3 V or lower, the recommended source and sink currents are lower. So I think one part of your question we can answer is, no, the 74LVC1G125 is not a good choice if you need to sink 50 - 100 mA.
Answering the rest of your question probably requires some more information: how fast does the buffer need to switch, what power supply voltages do you have available, and is the load resistive, capacitive, or something else?
One option that you could probably get to work very generally is just to use some general-purpose npn transistor on each output. With an appropriate resistor between the CPLD output pin and the transistor base to limit the current, it should be straightforward to achieve 100 mA sink current and output voltage below 0.5 V (depending on the load).
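As a rough illustration of sizing that base resistor, here is a minimal sketch. The 3.3 V drive level, the 0.7 V base-emitter drop, and the forced beta of 10 are assumptions chosen for the example, not values from your circuit; check the transistor datasheet and confirm that the CPLD pin can actually source the resulting base current.

```python
def base_resistor(v_drive, i_collector, forced_beta=10.0, v_be=0.7):
    """Return (base current in A, base resistor in ohms) for a saturated NPN switch.

    forced_beta is the assumed collector/base current ratio used to drive the
    transistor well into saturation; v_be is the assumed base-emitter drop.
    """
    i_base = i_collector / forced_beta
    r_base = (v_drive - v_be) / i_base
    return i_base, r_base


# Assumed example: 3.3 V logic-high output, 100 mA sink current.
i_b, r_b = base_resistor(v_drive=3.3, i_collector=0.100)
print(f"base current: {i_b * 1e3:.1f} mA, base resistor: {r_b:.0f} ohm")
# prints "base current: 10.0 mA, base resistor: 260 ohm"
```

If 10 mA is more than the CPLD output can comfortably supply, a higher-gain transistor, a Darlington, or a logic-level MOSFET would relax that requirement.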
A couple of approaches that may be useful for some styles of display are to divide the display panel into tiles, and
- restrict each tile to using a small set of colors, allowing the use of fewer than 8 bits per pixel, or
- use a byte or two from each tile to select a location from which to read bitmap data.
The first approach could reduce the rate at which data has to be read from display memory. For example, if one used tiles that were 16x16 and could each have four colors chosen from a set of 256, then without using any extra RAM in the FPGA one could reduce the number of memory reads per 16 pixels to eight (four color values, plus four bytes for the bitmap). If one added 160 bytes' worth of buffering/RAM(*) to the FPGA, one could reduce the number of memory reads per 16 pixels to four, using an extra 160 reads every 16 scan lines to read the next set of tile colors. If one wanted 16 colors per tile, the buffered variant would require an extra 640 bytes of RAM unless one placed some restrictions on the number of different palettes that could exist on a line.
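The arithmetic behind those figures can be sketched as follows. The 640-pixel line width is an assumption (it is what makes 40 tiles x 4 colors come out to 160 bytes), and `tile_row_cost` is a hypothetical helper name used only for this illustration.

```python
def tile_row_cost(tile_w, colors_per_tile, screen_w, buffered):
    """Return (memory reads per tile-wide pixel run, palette buffer bytes).

    Each tile color is assumed to be one byte (an index into 256 colors).
    """
    bits_per_pixel = (colors_per_tile - 1).bit_length()
    bitmap_bytes = tile_w * bits_per_pixel // 8
    palette_bytes = colors_per_tile
    tiles_per_row = screen_w // tile_w
    if buffered:
        # Palettes for a whole row of tiles are held in FPGA RAM,
        # so only the bitmap bytes are fetched while scanning out pixels.
        return bitmap_bytes, palette_bytes * tiles_per_row
    # No buffering: the palette is re-read along with the bitmap each time.
    return bitmap_bytes + palette_bytes, 0


print(tile_row_cost(16, 4, 640, buffered=False))  # (8, 0)
print(tile_row_cost(16, 4, 640, buffered=True))   # (4, 160)
print(tile_row_cost(16, 16, 640, buffered=True))  # (8, 640)
```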
The second approach would probably increase rather than reduce the total memory bandwidth required to produce a display, but would reduce the amount of memory that would have to be updated to change the display: one could change a byte or two to update an 8x8 or 16x16 area of the screen. Depending upon what you're trying to display, it may be helpful when using this style of approach to use one memory device to hold the tile shapes, and another to hold the tile selection. One might, for example, use a fast 32Kx8 RAM to hold a couple of 80x60 tile maps with two bytes per tile. If the FPGA didn't have any buffering, it would have to read one byte every four pixels; even with a 40ns static RAM, that would leave plenty of time for the CPU to update the display (an entire screen would only be 9600 bytes). The memory bandwidth for reading out the tile shapes would be no better than it is now, but that part of memory wouldn't have to be updated.
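To put rough numbers on the tile-map timing, here is a small sketch. The 25.175 MHz pixel clock (standard 640x480 VGA timing) and the 8-pixel tile width are assumptions; the answer above only gives the 80x60 map, two bytes per tile, and the 40ns RAM.

```python
TILE_COLS, TILE_ROWS = 80, 60
BYTES_PER_TILE = 2
TILE_W = 8                 # assumption: 8-pixel-wide tiles (80 * 8 = 640)
PIXEL_CLOCK_HZ = 25.175e6  # assumption: standard 640x480 VGA pixel clock
RAM_CYCLE_NS = 40
RAM_BYTES = 32 * 1024      # 32Kx8 static RAM

map_bytes = TILE_COLS * TILE_ROWS * BYTES_PER_TILE   # bytes in one tile map
maps_that_fit = RAM_BYTES // map_bytes                # whole maps per RAM
pixels_per_fetch = TILE_W // BYTES_PER_TILE           # pixels between map reads
ns_per_fetch = pixels_per_fetch / PIXEL_CLOCK_HZ * 1e9
accesses_per_window = int(ns_per_fetch // RAM_CYCLE_NS)
spare_accesses = accesses_per_window - 1              # left over for the CPU

print(map_bytes, maps_that_fit, pixels_per_fetch)     # 9600 3 4
print(f"{ns_per_fetch:.1f} ns between display reads, "
      f"{spare_accesses} spare RAM cycles per read")
```

Under these assumptions the display needs only one RAM cycle out of every three that fit between fetches, which is where the spare time for CPU updates comes from.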
Incidentally, if one didn't want to add a 32Kx8 RAM but could add 320 bytes of buffering/RAM(**) to the FPGA, one could use a tile-map approach but have the CPU or DMA feed 160 bytes to the display every 8 scan lines. That would burden the controller somewhat even when nothing on the display was changing, but could simplify the circuitry.
(*) The buffer could be implemented as RAM, or as a sequence of 32 40-bit-long shift registers plus a little control logic.
(**) The buffer could be implemented as two 160-byte RAMs, or as two groups of sixteen 80-bit shift registers.
Best Answer
No, they are not the same. Both are 'double buffering' in that they involve more than one buffer. However, ping-pong is a specific type, and the phrase 'double buffering' is usually (though not always) reserved for the non-ping-pong type.
In ping-pong buffering, there are two buffers, either of which can be used for output. While one provides output, the other can be filled asynchronously. The buffers are then switched over when required. The essence of ping-pong buffering is that the output goes back and forth between the two buffers, just like a ping-pong ball goes back and forth between the halves of the table.
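A minimal software sketch of the ping-pong idea, for illustration only: the class and attribute names are hypothetical, and in real hardware the swap would typically be a change of base address rather than a Python index.

```python
class PingPongBuffer:
    """Two equal buffers; either one can drive the output."""

    def __init__(self, size):
        self._buffers = [bytearray(size), bytearray(size)]
        self._front = 0  # index of the buffer currently driving the output

    @property
    def front(self):
        """Buffer currently being output."""
        return self._buffers[self._front]

    @property
    def back(self):
        """Buffer that can be filled while the front one is output."""
        return self._buffers[1 - self._front]

    def swap(self):
        """Exchange roles; no data is copied."""
        self._front = 1 - self._front


pp = PingPongBuffer(4)
pp.back[:] = b"\x01\x02\x03\x04"   # fill the idle buffer
before = bytes(pp.front)           # output still shows the old contents
pp.swap()
after = bytes(pp.front)            # output now comes from the filled buffer
print(before, after)
```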
Typically ping-pong buffering is used for video memory, especially when the memory is shared with the system. There, we have a large amount of information, and already have I/O addressing support for a simple and rapid switch of address spaces.
In double buffering, there is a first buffer that always receives the input, then a second buffer dedicated to driving the output, and a signal to transfer from one to the other.
Examples of double buffering are found in the HC595 shift register and the MAX534 quad DAC, where it provides the ability to receive and store the programmed word without changing the actual output until later. There, we have a small amount of information, and easy-to-connect memory, aka registers.
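The input-register-plus-output-register arrangement can be sketched like this. It is loosely modeled on the shift-then-latch behavior of an HC595-style part and is not an exact model of either device (no output enable or clear, for instance); the class and method names are hypothetical.

```python
class DoubleBufferedRegister:
    """Input stage always receives data; output changes only on latch()."""

    def __init__(self, width=8):
        self.width = width
        self._shift = 0   # first buffer: receives the serial input
        self.output = 0   # second buffer: drives the output pins

    def shift_in(self, bit):
        """Clock one bit into the input stage, MSB first."""
        mask = (1 << self.width) - 1
        self._shift = ((self._shift << 1) | (bit & 1)) & mask

    def latch(self):
        """Transfer signal: copy the input stage to the output stage."""
        self.output = self._shift


reg = DoubleBufferedRegister()
for bit in (1, 0, 1, 0, 0, 1, 0, 1):
    reg.shift_in(bit)
held = reg.output        # output is unchanged while data is shifted in
reg.latch()
print(hex(held), hex(reg.output))  # 0x0 0xa5
```

Unlike the ping-pong case, the two stages never exchange roles: data always flows from the input stage to the output stage.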