Electronic – Implementing a math equation in an FPGA: should I describe a CPU, or can I do it directly in code?

arithmetic-division, fpga, verilog

For a school project I'm trying to implement an equation for example like this:
(EDIT)

B = ((A + 2) * |A - 10|) / (c * c)

Everything is an unsigned binary value; absolute values are always taken.
The equation should be evaluated 57600 times per second for an image of 240×240 pixels.

I don't know how to start. Would it be better to build a MIPS processor and load the program into it as a list of assembly instructions?

Or should I take a direct approach in code? If so, what methodology should I follow? Should I use an FSM? Should I use clocks?

I tried writing it as simple combinational logic (assign… etc.) and it works, but it uses almost 80% of the available ALMs. I don't think this is the best way. I want to use as little hardware as possible; time is not a constraint. I'm using Quartus II and Verilog.

Best Answer

Depending on what you want to learn, there are many approaches possible.

You say the fully parallel combinatorial design works, and fits into your FPGA. Result! Many students would stop there and write it up. However, it sounds like you feel that this is not in the spirit of the project.

Creating your own processor design from scratch would be a project 100x the size of what you are attempting, for a general purpose core at least. Using an existing VHDL processor core would perhaps be too easy? Designing an ALU with just the instructions needed for these calculations still sounds quite a large detour.

The first place I would look to start serialising the design is that divide by c squared. Division is an operation that's very expensive or impossible to do as full-width lookup tables. Bit-wise shift-subtract is perhaps the mainstream way. Look up CORDIC as an alternative way of mechanising it. You may also want to consider byte- or nybble-wide shift and subtract as a further alternative, with latency and resource use somewhere between the two previous methods.
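To make the bit-wise shift-subtract idea concrete, here is a minimal sketch of a restoring divider that produces one quotient bit per clock. The module name, port names, the WIDTH parameter and the start/busy/done handshake are all assumptions made for illustration, not something taken from the question. For scale: if the clock were, say, 50 MHz (also an assumption), 57600 results per second would leave roughly 868 clock cycles per result, so a divider that takes WIDTH cycles is well within budget.

```verilog
// Sketch of a bit-serial restoring (shift-subtract) divider.
// All names are hypothetical. Assumes 2 <= WIDTH <= 255.
// Division by zero is not handled specially.
module shift_sub_div #(
    parameter WIDTH = 16
) (
    input  wire             clk,
    input  wire             rst,       // synchronous, active high
    input  wire             start,     // pulse for one clock to begin
    input  wire [WIDTH-1:0] dividend,
    input  wire [WIDTH-1:0] divisor,
    output reg  [WIDTH-1:0] quotient,
    output reg  [WIDTH-1:0] remainder,
    output reg              busy,
    output reg              done       // high for one clock when the result is valid
);
    reg [WIDTH-1:0] q;      // dividend bits leave at the top, quotient bits enter at the bottom
    reg [WIDTH-1:0] r;      // partial remainder, always smaller than the divisor
    reg [WIDTH-1:0] d;      // latched divisor
    reg [7:0]       count;  // iterations left

    // Shift the next dividend bit into the partial remainder, then try the subtraction.
    wire [WIDTH:0]   shifted = {r, q[WIDTH-1]};
    wire [WIDTH:0]   diff    = shifted - {1'b0, d};
    wire             fits    = (shifted >= {1'b0, d});
    wire [WIDTH-1:0] next_r  = fits ? diff[WIDTH-1:0] : shifted[WIDTH-1:0];
    wire [WIDTH-1:0] next_q  = {q[WIDTH-2:0], fits};

    always @(posedge clk) begin
        done <= 1'b0;                     // default; overridden on the last iteration
        if (rst) begin
            busy <= 1'b0;
        end else if (start && !busy) begin
            q     <= dividend;
            d     <= divisor;
            r     <= {WIDTH{1'b0}};
            count <= WIDTH;
            busy  <= 1'b1;
        end else if (busy) begin
            r     <= next_r;
            q     <= next_q;
            count <= count - 8'd1;
            if (count == 8'd1) begin      // last of WIDTH iterations
                quotient  <= next_q;
                remainder <= next_r;
                busy      <= 1'b0;
                done      <= 1'b1;
            end
        end
    end
endmodule
```

Tracing it by hand with WIDTH = 4, dividend 13 and divisor 3, the four busy cycles leave quotient 4 and remainder 1. The only arithmetic hardware is one subtractor and one comparator of WIDTH+1 bits, reused every cycle, which is where the area saving over a fully combinational divider comes from.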

Maybe you could look at implementing serial arithmetic as an exercise, on the grounds of saving space. Hold the variables in shift registers, and cycle them through a one-bit ALU plus carry, LSB first. There are all sorts of interesting state machine issues to solve.
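As a starting point for the serial approach, the one-bit ALU can be as small as a full adder with a carry flip-flop. This is only a sketch of the addition case; the module and port names are hypothetical, and the shift registers and the state machine that sequences them are left out.

```verilog
// Sketch of a one-bit serial adder cell: a full adder plus a carry register.
// All names are hypothetical. Operands arrive one bit per clock, LSB first.
module serial_adder (
    input  wire clk,
    input  wire clear,    // synchronous: assert for one clock before the LSBs to zero the carry
    input  wire a_bit,    // operand A, current bit
    input  wire b_bit,    // operand B, current bit
    output wire sum_bit   // sum, current bit (combinational)
);
    reg carry;            // carry out of the previous bit position

    assign sum_bit = a_bit ^ b_bit ^ carry;

    always @(posedge clk) begin
        if (clear)
            carry <= 1'b0;
        else
            carry <= (a_bit & b_bit) | (carry & (a_bit ^ b_bit));
    end
endmodule
```

In use, the two operands would sit in shift registers that shift right once per clock and feed a_bit and b_bit from their bottom bits, while sum_bit is shifted into the top of a result register. After as many clocks as the operands have bits, the result register holds the sum, and the carry register holds the final carry out.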