The key to good engineering is to keep it simple. In your case, you can use something very simple to turn a relay on and off to control a heater. Some pseudocode:
int temperature;
int setpoint;
int status = 0;
int pulses = 0;
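int integral = 4; //gain numerator (just a scale factor, not a true PID integral term)
int derivative = 5; //gain denominator (not a true PID derivative term)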
int upperTemp = 1000; //100.0C
int upperLimit = 100; //pulses per time frame
int counter = 0;
//NO PULSING
runtime //make a timer that runs and updates temperature value
{
read temperature;
temperature *= 10; //still keeping in int for faster processing
status = ((setpoint - temperature)*integral)/derivative; //EX: (800-400)*4/5 = 320
//EX: (800-780)*4/5 = 16
//keep the sign (no abs): a negative status means the temperature is above setpoint
//EX: (800-850)*4/5 = -40 -> too hot, so the relay must stay off
if(status < 16) //from the example, 16 means the temperature is within 2.0C of setpoint; you can always
Relay Off; //change this value to something else (e.g. (800-795)*4/5 = 4;
else //4 gives a 0.5C tolerance range, which is a bit tight for temperature)
Relay On; //anyway, play around with this value;
//you can also change integral and derivative if you wish.
}
//WITH PULSING
runtime //make a timer that runs and updates temperature value...lets say every second
{
read temperature; //every second
temperature *= 10; //multiply by 10 for decimal place, but still keeping in integer
pulses = map(temperature, upperTemp, 0, 0, upperLimit); //when temperature increases,
//pulses decreases...vice versa
//take this value of pulses to update the timer interval
//timer's interval = pulses per second or minute
}
timer
{
if(Relay On)
Relay Off;
else
Relay On;
}
//borrowed from Arduino
long map(long x, long in_min, long in_max, long out_min, long out_max)
{
return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}
//PWM method: you still need the runtime with pulsing from above. Say you get 75 pulses: the relay will be on for 75% of each period and off for the other 25%. With a period of 1 s (1000 ms), the relay will be on for 750 ms and off for 250 ms. If the temperature is very far from the setpoint, it might even be on 100% of the time.
//choose the entire period as tick_interval * number of ticks (one tick per percent): 100 ms * 100 = 10000 ms, or 10 seconds. Back to the example: with 75 pulses, the relay will be on for 7.5 seconds and off for 2.5 seconds.
timer //example: 100 ms per tick
{
if(counter >= 100) //has reached the end of the time period (ticks 0..99)
counter = 0;
if(counter >= pulses) //pulses from above is now the duty cycle; on for exactly 'pulses' ticks
{
if(Relay is On)
Relay off;
}
else
{
if(Relay is Off)
Relay on;
}
counter++; //increment by 1
}
//this method is more flexible than using a fixed number (30 minutes) because different heater controllers respond differently
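If it helps, here is roughly how the PWM-style timer above might look as plain C. Treat it as a sketch only: `map_range`, `duty_from_temperature` and `relay_on_for_tick` are names I made up for illustration, the 100-tick window and the 0 to 100.0C span come from the example numbers above, and the actual relay and temperature I/O are left out because they depend on your hardware.

```c
#include <assert.h>

#define TICKS_PER_WINDOW 100  /* assumed: 100 ticks of 100 ms = one 10 s window */
#define UPPER_TEMP 1000       /* 100.0C, stored in tenths of a degree */

/* Same linear rescale as Arduino's map(); renamed so it cannot clash with it. */
long map_range(long x, long in_min, long in_max, long out_min, long out_max)
{
    assert(in_max != in_min);  /* otherwise we would divide by zero */
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}

/* Duty cycle in percent: 100 at 0.0C, falling linearly to 0 at UPPER_TEMP. */
int duty_from_temperature(int temp_tenths)
{
    /* Clamp first so the duty cycle always stays inside 0..100. */
    if (temp_tenths < 0) temp_tenths = 0;
    if (temp_tenths > UPPER_TEMP) temp_tenths = UPPER_TEMP;
    return (int)map_range(temp_tenths, UPPER_TEMP, 0, 0, TICKS_PER_WINDOW);
}

/* Call once per tick. Returns 1 if the relay should be on during this tick,
 * then advances the counter, wrapping at the end of the window. */
int relay_on_for_tick(int *counter, int duty)
{
    int on = (*counter < duty);
    *counter = (*counter + 1) % TICKS_PER_WINDOW;
    return on;
}
```

In a real timer callback you would call `relay_on_for_tick` every 100 ms and drive the relay pin from its return value, while a slower task refreshes the duty cycle from the latest temperature reading.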
A colleague of mine benchmarked this and concluded that FPGAs outperform a PC once you have more than about 100 independent integer tasks that fit in the FPGA. For floating-point tasks, GPGPU beat FPGA throughout. For narrow multithreading or SIMD operations, CPUs are extremely optimised and run at higher clock speeds than FPGAs typically achieve.
The other caveats: tasks must be independent. Data dependencies between tasks serialise the work along the critical path and limit how much can run in parallel. FPGAs are good for boolean evaluation and integer maths, as well as low-latency hardware interfaces, but not for memory-dependent workloads or floating point.
If you have to keep the workload in DRAM then that will be the bottleneck rather than the processor.
Best Answer
Sorting in an FPGA is typically done using a sorting network. One good example of a sorting network is bitonic sort. A sorting network is a fixed network of comparators where the order of operations does not depend on the data. Bitonic sort has a complexity of O(n*log(n)^2). Although that is worse than the O(n*log(n)) of the sorting algorithms popular in software, it is often still more efficient to implement in an FPGA because of its fixed structure.
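To make the "fixed structure" point concrete, here is a small software model of a bitonic network in C. It is a sketch under my own assumptions, not HDL: the names `compare_exchange` and `bitonic_sort` are illustrative, the input length must be a power of two, and the inner loop that runs sequentially here would be a row of parallel comparators in hardware. The thing to notice is that which pairs get compared depends only on the indices, never on the values.

```c
#include <assert.h>
#include <stddef.h>

/* One comparator: afterwards *lo holds the smaller value, *hi the larger. */
void compare_exchange(int *lo, int *hi)
{
    if (*lo > *hi) {
        int tmp = *lo;
        *lo = *hi;
        *hi = tmp;
    }
}

/* Sorts a[0..n-1] ascending; n must be a power of two.
 * Returns the number of comparator stages, i.e. the depth of the network. */
int bitonic_sort(int *a, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    int stages = 0;
    for (size_t k = 2; k <= n; k <<= 1) {          /* size of the blocks being merged */
        for (size_t j = k >> 1; j > 0; j >>= 1) {  /* distance between compared lanes */
            /* Every comparator in this stage touches a distinct pair of lanes,
             * so in hardware they could all fire in the same clock cycle. */
            for (size_t i = 0; i < n; i++) {
                size_t partner = i ^ j;
                if (partner > i) {
                    if ((i & k) == 0)
                        compare_exchange(&a[i], &a[partner]);  /* ascending block */
                    else
                        compare_exchange(&a[partner], &a[i]);  /* descending block */
                }
            }
            stages++;
        }
    }
    return stages;
}
```

For 8 inputs this gives 6 stages of 4 comparators each, which is where the log(n)^2 factor comes from.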
For small arrays such as your nine values, you can just use a fixed sorting network with a throughput of nine sorted values per clock cycle. For larger arrays or lower throughput requirements, the sort can be computed in several passes, somewhat like an FFT, with a fixed k-input bitonic sort network applied to the data several times. Typically k is chosen large enough to minimize the number of passes while keeping the data-path size feasible. The smallest bitonic sort network is a 2-input network where one output is the minimum value and the other is the maximum value.
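As a rough illustration of a fixed network for nine values, the C model below chains the 2-input min/max element into an odd-even transposition network. Note that I have swapped in odd-even transposition rather than bitonic sort here, simply because it works for nine inputs without padding to a power of two; it is not the smallest known network for nine inputs. The names `compare_exchange` and `sort9` are made up for this sketch.

```c
#define N 9

/* The 2-input network: afterwards *lo is the minimum and *hi the maximum. */
void compare_exchange(int *lo, int *hi)
{
    if (*lo > *hi) {
        int tmp = *lo;
        *lo = *hi;
        *hi = tmp;
    }
}

/* Fixed odd-even transposition network for N inputs.
 * Even rounds compare lanes (0,1),(2,3),..., odd rounds compare (1,2),(3,4),...
 * N rounds are enough to sort any input, and the comparator pattern never
 * depends on the data, so each round maps onto one row of parallel comparators. */
void sort9(int v[N])
{
    for (int round = 0; round < N; round++) {
        for (int i = round % 2; i + 1 < N; i += 2) {
            compare_exchange(&v[i], &v[i + 1]);
        }
    }
}
```

In an FPGA you would unroll both loops into 9 rows of 4 comparators and put pipeline registers between the rows, so that a new set of nine values can enter every clock cycle.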