Electronic – Feasibility Question – graphics acceleration on CPLD / FPGA / DSP

dsp fpga math programmable-logic

I am a programmer new to electronics. I wanted to get a perspective on whether programmable logic is feasible for accelerating a basic math algorithm.

I want to solve a ray-intersection algorithm (a few multiplications and subtractions) over an 800×600 grid (480,000 points). [I understand that integers would be ideal and that control logic is frowned upon – I can work around these constraints.]

Everything I have researched so far suggests that I could offload this processing from the CPU onto an FPGA programmed to calculate the problem space very efficiently.

A few questions:

I am thinking of using a CPLD – perhaps literally an Altera MAX® 10 – is this the right scale of device to get this type of problem done?

If I wanted to calculate the problem set 100 times per second, would that be possible?

(Assuming performance is a problem) Can the problem be easily divided amongst different chips, where each chip writes its solution to a different region of RAM?

Is this sort of project feasible to attach to a Thunderbolt, USB 3, or SATA port, or a PCI Express slot, in a PC?
– What kind of investment does it take to create a board that sophisticated (timing-wise)?

Could this be done with DSP chips? (I'm having a hard time understanding where DSPs stop being usable – I understand their typical use in filters etc., but how much further can they be applied? Can they perform simple integer math operations?)

Could this be done with discrete logic chips? I was looking at this EE project http://www.ele.uri.edu/~vijay/ProjectELE447.pdf, which described all manner of ALU and multiplier chips – could these simply be piped together to express this algorithm efficiently?

How would my needs change if I needed to solve a grid of 4000×2000?

Thanks for your time

Best Answer

This is, presumably, for ray tracing acceleration?

See also Can FPGA out perform a multi-core PC?

A colleague of mine benchmarked this and came to the conclusion that FPGAs would outperform a PC once you had more than about 100 independent integer tasks that fit in the FPGA. For floating-point tasks, GPGPU beat FPGA throughout. For narrow multithreading or SIMD operation, CPUs are extremely optimised and run at higher clock speeds than FPGAs typically achieve.

The MAX 10 is a range of FPGAs (not CPLDs) of varying sizes. They are certainly capable of multiplication: up to 144 different 18×18 multiplications per cycle. You need to be careful with FPGA designs not to end up limited by the speed of your DRAM. It's also a fairly substantial project to learn to program one from scratch, and the tools are kind of frustrating.

Can the problem be easily divided amongst different chips? Well, probably: this kind of tile-based solution is not unusual. How to partition it is for you to work out, or is a software question. Remember that coordination between devices is much slower than coordination within a device.

PCIe: I was involved in a project that built some multichannel fast ADC boards with FPGAs mounted in PCIe cards. We had half a dozen soldered by hand by experts, resulting in them costing about $1000 each. The layout took a few weeks; it was done by a graduate student with an experienced engineer looking over his shoulder regularly.

DSPs are built to do integer maths, especially multiply-accumulate (a = b*c + d). See What is the difference between a DSP and a standard microcontroller? They generally offer functionality similar to SIMD instructions; Intel has offered fused multiply-add since Haswell.

Could this be done with discrete logic chips?

The paper you linked looks like someone's thesis project on actual silicon design. Taking that out of the simulator and building a chip from it is again a multi-thousand-dollar operation, and very rarely worth it.

The other approach of assembling chips of varying functions on a PCB has not been sensibly fast since the early 80s.
