What type of memory allows for the most parallel read/write operations per clock cycle in an FPGA?

ddr3 fpga memory ram sram

Imagine basic motion detection where you have two frames stored in memory: a previous 640×480 frame and the current 640×480 frame. What type of memory (SRAM, DRAM, SDRAM, DDR SDRAM, etc.) would allow for the most parallel read/write operations per clock cycle? For example, the perfect solution would allow the simultaneous read of both frames (307,200 × 2 pixels) in one clock cycle. What is the best memory for this: the SRAM built into the FPGA, or an external chip like DDR SDRAM? (In the bigger picture, I'd be looking for an FPGA dev board that has the smallest cost and size with the largest parallel read/write ability.)

Best Answer

Dual-port block RAM and LUT RAM are pretty much impossible to beat, as they are on the FPGA die and accessing them does not require any I/O pins. If you don't have enough capacity in block RAM, then you can throw external memory at the problem. QDR SRAM is dual ported, with separate read and write ports, and so can move roughly twice the data of a comparable single-port SDRAM interface. This can be useful for some applications, though it is very expensive. DDR3 SDRAM is probably the cheapest option, but even this is relatively slow compared to what you can do with LUT RAM and block RAM.
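Whether the two frames fit in block RAM at all is a quick calculation. Here is a rough sketch in Python, assuming 8-bit grayscale pixels (the question doesn't give a pixel depth). `on_chip_ram_bits` is a placeholder for whatever your device's datasheet lists, not a figure for any particular FPGA.

```python
def frame_bits(width, height, bits_per_pixel):
    """Storage needed for one frame, in bits."""
    return width * height * bits_per_pixel


def fits_on_chip(total_bits, on_chip_ram_bits):
    """True if the buffers fit in the given amount of on-chip RAM.

    on_chip_ram_bits is a hypothetical input: look it up in the
    datasheet of the FPGA you are considering.
    """
    return total_bits <= on_chip_ram_bits


# Assumption: 8 bits per pixel (grayscale).
TWO_FRAMES_BITS = 2 * frame_bits(640, 480, 8)   # 4,915,200 bits
TWO_FRAMES_BYTES = TWO_FRAMES_BITS // 8         # 614,400 bytes
```

If `fits_on_chip(TWO_FRAMES_BITS, ...)` is false for your part, that is the point where external memory comes in.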

I think you're going about your design the wrong way. If you need to do any real processing on that many inputs at once, the logic alone would consume an absolutely gigantic amount of area, and may not fit on the largest FPGAs available today. What FPGAs excel at is very high speed pipelined processing. What you should do is figure out how many pixels per second you need to process (640 × 480 × fps) and then figure out how to implement your image processing algorithm to get that level of performance. Generally the idea is to read in a small number of pixels per clock cycle and then process them in an orderly fashion.
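To put numbers on that, here is a small Python model of the calculation and of the streaming approach. It is only a sketch: the 60 fps frame rate, the 100 MHz clock, and the threshold-based difference are assumptions for illustration, not something taken from the question, and real hardware would be written in an HDL rather than Python.

```python
import math


def pixel_rate(width, height, fps):
    """Pixels per second that the pipeline must sustain."""
    return width * height * fps


def pixels_per_clock(width, height, fps, clock_hz):
    """Minimum whole number of pixels to consume per clock cycle."""
    return math.ceil(pixel_rate(width, height, fps) / clock_hz)


def stream_motion_mask(prev_frame, curr_frame, threshold):
    """Model of a pipelined design: one pixel pair per 'clock'.

    Yields True where the absolute difference between the previous
    and current pixel exceeds the (assumed) threshold.
    """
    for prev_px, curr_px in zip(prev_frame, curr_frame):
        yield abs(curr_px - prev_px) > threshold


# Assumed figures: 60 fps video and a 100 MHz pipeline clock.
RATE_60FPS = pixel_rate(640, 480, 60)                      # 18,432,000 pixels/s
PER_CLOCK = pixels_per_clock(640, 480, 60, 100_000_000)    # 1 pixel per clock
```

Under those assumptions, one pixel from each frame per clock cycle is already more than enough, so the memory only needs to supply two pixels per cycle rather than two whole frames.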