Solutions for floating point rounding errors

floating point, numeric precision

In building an application that deals with a lot of mathematical calculations, I have encountered the problem that certain numbers cause rounding errors.

While I understand that floating point is not exact, how do I deal with exact numbers so that floating point rounding doesn't cause any issues when calculations are performed on them?

Best Answer

There are three fundamental approaches to creating alternative numeric types that are free of floating point rounding. The common theme is that they all use integer math, each in a different way.

Rationals

Represent the number as a whole part plus a rational part with a numerator and a denominator. The number 15.589 would be represented as w: 15; n: 589; d: 1000.

When added to 0.25 (which is w: 0; n: 1; d: 4), this involves calculating the LCM of the two denominators (1000 here), converting both fractions to that denominator, and then adding the numerators. This works well for many situations, though it can result in very large numerators and denominators when you are working with many rational numbers whose denominators are relatively prime to each other.
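As a rough sketch of the idea, Python's standard `fractions.Fraction` type does this bookkeeping for you. Note that it folds the whole part into the numerator rather than storing w, n and d separately, so it is an equivalent representation and not exactly the layout described above. The variable names are only for illustration.

```python
from fractions import Fraction

a = Fraction(15589, 1000)   # 15.589, whole part folded into the numerator
b = Fraction(1, 4)          # 0.25

total = a + b               # common denominator is found internally
print(total)                # 15839/1000

# Binary floats cannot represent these values exactly; rationals can.
print(0.1 + 0.2 == 0.3)                                        # False
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))    # True

# Denominators that are relatively prime multiply together,
# which is how the numbers get large.
grown = sum(Fraction(1, p) for p in (2, 3, 5, 7, 11, 13))
print(grown.denominator)    # 30030
```

Six small fractions already need a five-digit denominator, so long chains of calculations can get expensive.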

Fixed point

You have the whole part and the decimal part, each stored as an integer. All numbers are rounded (there's that word - but you know where it is) to that precision. For example, you could have fixed point with 3 decimal places. Adding 15.589 + 0.250 becomes (589 + 250) % 1000 for the decimal part, with any carry, (589 + 250) / 1000 using integer division, added to the whole part. This works very nicely with existing databases. As mentioned, there is rounding, but you know where it is and can specify it such that it is more precise than is needed (you are only measuring to 3 decimal places, so make it fixed 4).
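A minimal sketch of that addition, assuming non-negative values and 3 decimal places. The names `SCALE`, `add_fixed` and `format_fixed` are made up for this example and don't come from any library.

```python
SCALE = 1000  # 3 decimal places (assumption for this sketch)

def add_fixed(a, b):
    """Add two non-negative (whole, frac) pairs where 0 <= frac < SCALE."""
    # divmod gives the carry into the whole part and the new decimal part.
    carry, frac = divmod(a[1] + b[1], SCALE)
    return (a[0] + b[0] + carry, frac)

def format_fixed(x):
    """Render a (whole, frac) pair, zero-padding the decimal part."""
    return f"{x[0]}.{x[1]:03d}"

print(add_fixed((15, 589), (0, 250)))                 # (15, 839)
print(add_fixed((15, 589), (0, 500)))                 # (16, 89), carry into the whole part
print(format_fixed(add_fixed((15, 589), (0, 500))))   # 16.089
```

Many implementations skip the pair and store a single integer count of the smallest unit instead (15589 thousandths here), which makes addition plain integer addition. Negative values need extra care in either layout.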

Floating fixed point

Store a value and the precision. 15.589 is stored as 15589 for the value and 3 for the precision, while 0.25 is stored as 25 and 2. This can handle arbitrary precision. I believe this is what the internals of Java's BigDecimal use (haven't looked at it recently). At some point, you will want to get it back out of this format and display it - and that may involve rounding (again, you control where it is).
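To make the alignment step concrete, here is a small sketch in Python. `add_scaled` is a hypothetical helper written for this answer. The comparison at the end uses Python's standard `decimal.Decimal`, which stores a similar digits-plus-exponent pair.

```python
from decimal import Decimal, ROUND_HALF_UP

def add_scaled(a, b):
    """Add two (value, precision) pairs by aligning them to the larger precision."""
    (va, pa), (vb, pb) = a, b
    precision = max(pa, pb)
    va *= 10 ** (precision - pa)   # e.g. 25 at precision 2 becomes 250 at precision 3
    vb *= 10 ** (precision - pb)
    return (va + vb, precision)

print(add_scaled((15589, 3), (25, 2)))        # (15839, 3), i.e. 15.839

# The standard library type gives the same answer.
total = Decimal("15.589") + Decimal("0.25")
print(total)                                  # 15.839

# Rounding only happens where you ask for it, e.g. for display.
shown = total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(shown)                                  # 15.84
```

Because Python integers are unbounded, the value part can grow as large as needed, which is where the arbitrary precision comes from.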


Once you have chosen a representation, you can either find an existing third-party library that implements it or write your own. When writing your own, be sure to unit test it and make sure you are doing the math correctly.