I'm designing a custom FPGA board around something low-cost like a Xilinx Spartan 6. I want to research running memory-intensive algorithms on an FPGA.
As we all know, memory bandwidth is often the bottleneck, especially in low-cost parts like the Spartan 6. A mid-range GPU, however, has 150+ GB/s of memory bandwidth.
Is there any way to increase bandwidth in a low-cost FPGA to near-GPU levels?
I see only a few ways:
- Connecting high-bandwidth memory like DDR4 to a GPU chip and connecting the whole thing to the FPGA (a rather strange solution; I don't know if it's feasible and, if it is, wouldn't the link between the GPU and the FPGA become the bottleneck?)
- Using multiple wide, fast memory interfaces to connect off-chip memory to the FPGA
- Using custom controllers, connections or something else optimized specifically for this task to improve bandwidth
I need at least 100 GB/s. On a low-cost Spartan 6 FPGA, that bandwidth would count as a success. Or is it simply impossible with this hardware?
Best Answer
Impossible. Assume a 512-bit memory interface built from 8 memory modules (64 bits each) in parallel. That alone comes very close to the maximum user I/O count on the largest available Spartan-6 part (540 user I/Os on the XC6SLX150T), and you would probably exceed the limit once control signals are added. Even if such an interface were possible, 100 GB/s would require every data pin to run at roughly 1.5 Gbit/s (and that assumes 100% efficient memory access!), which is unlikely to be attainable on a Spartan-6.
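The arithmetic behind the 512-bit estimate above can be checked with a short script. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the 0.8 Gbit/s pin rate in the second call is an illustrative assumption, not a figure taken from a datasheet.

```python
def required_pin_rate_gbps(bandwidth_gbytes_per_s, bus_width_bits, efficiency=1.0):
    """Per-pin data rate (Gbit/s) needed to reach a target bandwidth (GB/s)."""
    total_gbits_per_s = bandwidth_gbytes_per_s * 8  # bytes -> bits
    return total_gbits_per_s / (bus_width_bits * efficiency)


def achievable_bandwidth_gbytes(bus_width_bits, pin_rate_gbps, efficiency=1.0):
    """Bandwidth (GB/s) delivered by a bus of the given width and per-pin rate."""
    return bus_width_bits * pin_rate_gbps * efficiency / 8  # bits -> bytes


# Target from the question: 100 GB/s over a 512-bit bus, perfect efficiency.
print(f"{required_pin_rate_gbps(100, 512):.4f} Gbit/s per pin")

# Hypothetical pin rate of 0.8 Gbit/s (an assumption for illustration only).
print(f"{achievable_bandwidth_gbytes(512, 0.8):.1f} GB/s")
```

Lowering `efficiency` below 1.0 (real memory controllers lose cycles to refresh, bank conflicts and read/write turnaround) pushes the required per-pin rate even higher.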
For memory-heavy applications, a GPU is often a surprisingly good solution.