I would like to know why some high-end microcontrollers have a separate FPU peripheral, even though low-end 8-bit controllers can perform float calculations on their own.

I experimented with a float division on a PIC18 controller, which does not have an FPU (most of the 8-bit family does not), and found by debugging with an ICD 3 that, at a 48 MHz clock, a simple division of a float variable takes more than 800 clock cycles.

I don't have a controller with an FPU at hand to test with, but I would still like to know what practical difference and advantage the addition of an FPU provides.

## Best Answer

While FPUs are typically slower than the main CPU, they are much faster at floating-point calculations because they implement the floating-point operators in hardware. To better understand what happens, consider that a floating-point operation consists of several steps.

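For instance, an addition is roughly:

- Align the exponents by shifting the mantissa of the smaller operand.
- Add the mantissas.
- Normalize the result, adjusting the exponent.
- Round the result.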

Note that these operations must be performed separately on a fixed-point (integer-only) CPU, and each may take more than one cycle.
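To give a feel for how much integer work hides behind a single `a + b`, here is a minimal sketch of a software float addition in C. It is not what any particular compiler's runtime library actually does: the names `soft_add_positive`, `float_to_bits` and `bits_to_float` are made up for illustration, it assumes `float` is a 32-bit IEEE 754 binary32 value, and it only handles positive, normal, finite inputs (no signs, zeros, subnormals or NaNs), rounding to nearest-even.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: reinterpret a float as its raw 32 bits and back. */
static uint32_t float_to_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_to_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Sketch only: adds two positive, normal, finite binary32 values
 * using integer operations, with round-to-nearest-even. */
float soft_add_positive(float a, float b)
{
    uint32_t ua = float_to_bits(a);
    uint32_t ub = float_to_bits(b);

    /* Unpack: 8-bit biased exponent and 24-bit mantissa (implicit 1 restored).
     * The mantissa is shifted left by 3 to make room for the
     * guard, round and sticky bits used for rounding. */
    int32_t ea = (int32_t)((ua >> 23) & 0xFFu);
    int32_t eb = (int32_t)((ub >> 23) & 0xFFu);
    uint32_t ma = ((ua & 0x7FFFFFu) | 0x800000u) << 3;
    uint32_t mb = ((ub & 0x7FFFFFu) | 0x800000u) << 3;

    /* Make (ea, ma) the operand with the larger exponent. */
    if (ea < eb) {
        int32_t te = ea; ea = eb; eb = te;
        uint32_t tm = ma; ma = mb; mb = tm;
    }

    /* Align: shift the smaller operand right, folding every lost bit
     * into the lowest (sticky) bit so rounding stays correct. */
    int32_t shift = ea - eb;
    if (shift >= 27) {
        mb = 1u;                                   /* only the sticky bit survives */
    } else if (shift > 0) {
        uint32_t lost = mb & ((1u << shift) - 1u);
        mb = (mb >> shift) | (lost != 0u);
    }

    /* Add the aligned mantissas. */
    uint32_t m = ma + mb;
    int32_t e = ea;

    /* Normalize: a carry out of the 24-bit mantissa means shifting right
     * by one and incrementing the exponent. */
    if (m & (1u << 27)) {
        m = (m >> 1) | (m & 1u);
        e += 1;
    }

    /* Round to nearest, ties to even, using the three extra bits. */
    uint32_t low = m & 7u;
    m >>= 3;
    if (low > 4u || (low == 4u && (m & 1u))) {
        m += 1u;
        if (m & (1u << 24)) {                      /* rounding overflowed the mantissa */
            m >>= 1;
            e += 1;
        }
    }

    /* Pack: exponent overflow becomes +infinity. */
    if (e >= 255) {
        return bits_to_float(0x7F800000u);
    }
    return bits_to_float(((uint32_t)e << 23) | (m & 0x7FFFFFu));
}
```

For example, `soft_add_positive(1.5f, 2.25f)` returns `3.75f`. Every unpack, compare, shift, add and round here is at least one integer instruction, and on an 8-bit core each 32-bit operation is itself split into several byte-wide ones, which is roughly why software float operations cost hundreds of cycles.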

If you design a dedicated unit that can partly parallelize these operations (e.g., rounding and adjusting the exponent), you can save significant computation time.