I side with OP's personal theory: it is not normal practice to let a computer program proceed with a division-by-zero operation, or to perform only a minimal check before the division.
The exception is when you are implementing something very general - a programming language (such as MATLAB) where you (as the programmer) do not know the context / application / use-case / physical meaning of the mathematical operations it is asked to perform. This may be because the formula it evaluates is provided by the customer, and you do not know the customer's use-case for that formula. In that case you use a special representation such as Inf or NaN as a placeholder.
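For instance, here is a minimal sketch in C (IEEE-754 arithmetic assumed): with default floating-point settings, the division simply produces the placeholder value instead of halting the program.

```c
#include <math.h>   /* isinf, isnan */
#include <stdio.h>

/* Minimal sketch (IEEE-754 assumed): a general-purpose evaluator can let
   the hardware produce placeholder values instead of stopping. */
int main(void)
{
    double zero = 0.0;         /* a variable, so this is a runtime division */
    double pos = 1.0 / zero;   /* +Inf */
    double bad = zero / zero;  /* NaN  */
    printf("1/0 -> %f (isinf=%d)\n", pos, isinf(pos) != 0);
    printf("0/0 -> %f (isnan=%d)\n", bad, isnan(bad) != 0);
    return 0;
}
```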
If, however, the formula is provided as part of a statistical toolbox, then you should be able to provide an explanation when the situation arises. See the "weighted averaging when the total weight is zero" example below.
There is a way to "invert" a divisor underflow test. Mathematically, if b is not zero, then abs(a) / abs(b) > abs(c), where c is the largest representable floating-point value, is equivalent to abs(a) > abs(c) * abs(b). In practice, however, this requires a more careful implementation, because the multiplication on the right-hand side can itself overflow.
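A minimal sketch of that more careful implementation, assuming finite inputs (the function name is illustrative): the trick is to multiply only when abs(b) < 1, so that abs(c) * abs(b) cannot itself overflow.

```c
#include <float.h>    /* DBL_MAX */
#include <math.h>     /* fabs */
#include <stdbool.h>

/* Sketch: report whether a / b would overflow, without performing the
   division.  Assumes a and b are finite; treats b == 0 as "overflow". */
bool division_would_overflow(double a, double b)
{
    if (b == 0.0)
        return true;
    double ab = fabs(b);
    if (ab >= 1.0)
        return false;             /* |a| <= DBL_MAX, so |a| / |b| <= DBL_MAX */
    /* Here |b| < 1, so DBL_MAX * |b| stays below DBL_MAX and cannot
       itself overflow. */
    return fabs(a) > DBL_MAX * ab;
}
```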
You may be able to find a mathematical library function that allows you to pass in (a, b) and will return whether the division will overflow, underflow, or otherwise have poor precision.
Source code analyzers look for patterns in the code; they are not sophisticated enough to decide whether someone's workaround logic is sufficient for the application's design purpose. (In fact even the average programmer may be unqualified to make that decision.) Source code analyzers are supposed to be augmented with a person qualified to make that decision.
A denominator of zero can occur in a lot of mathematical manipulations: formulas, infinite series (summations of sequences), etc. There are many mathematical methods to calculate the result despite denominators that approach zero (i.e. not exactly zero, but smaller than the smallest machine-representable value). This means the formulas are not to be evaluated verbatim - they are transformed using calculus methods, and for each formula there may be several alternative versions, chosen to avoid the division-by-zero issue.
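A classic instance of such a transformation, sketched below: sin(x)/x evaluated verbatim divides by zero at x == 0, but its Taylor series near zero needs no division at all. (The cutoff 1e-4 is an illustrative choice, not a tuned constant.)

```c
#include <math.h>

/* Sketch: sinc(x) = sin(x)/x.  Near zero, switch to the series
   sin(x)/x = 1 - x^2/6 + x^4/120 - ..., which avoids the division;
   for |x| < 1e-4 the truncated series is accurate to double precision. */
double sinc(double x)
{
    if (fabs(x) < 1e-4) {
        double x2 = x * x;
        return 1.0 - x2 / 6.0 + (x2 * x2) / 120.0;
    }
    return sin(x) / x;
}
```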
Another situation arises in weighted averaging of data. If you perform a query that selects a subset of the data, and:
- the sum of weights for that subset turns out to be zero, or
- the subset is in fact empty, i.e. the query returns no result,
then the proper way to phrase the situation is "insufficient samples (data) for the query", etc.
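A minimal sketch of that policy (names are illustrative): refuse to divide by a zero total weight and report the failure to the caller, rather than emitting Inf or NaN.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sketch: weighted average that reports "insufficient samples" instead
   of dividing by a zero total weight. */
bool weighted_average(const double *values, const double *weights,
                      size_t n, double *out)
{
    double sum = 0.0, total_weight = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += values[i] * weights[i];
        total_weight += weights[i];
    }
    if (total_weight == 0.0)
        return false;   /* empty subset, or weights cancel to zero */
    *out = sum / total_weight;
    return true;
}
```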
In basic trigonometry, some representations (slope) are very sensitive to division problems, whereas an alternative representation (bearing, i.e. angle) would not be sensitive. For example, to represent a line on a 2D plane, where vertical and near-vertical lines need to be represented as robustly as horizontal and near-horizontal lines, you can:
- have a toggle between lines that are steep vs. those that are not: for lines steeper than 45 degrees, you would use (x / y) instead of (y / x) as the "flipped" slope of the line, so as to avoid division by small numbers.
- use an alternative representation such as a*x + b*y + c == 0 and store the parameters (a, b, c), with the requirement that (a^2 + b^2) must equal 1.0 in the normal case, and 0.0 if the line is degenerate (not-a-line); see the sketch after the next paragraph.
It is worth mentioning that degeneracy is unavoidable in many different contexts (and in context-specific ways). For example, if the user passes in a "line" from point (x1, y1) to point (x2, y2) and asks to calculate its slope, and it happens that (x1 == x2 and y1 == y2), then there is no slope, because there is no line, because there is only a single point in the user's input.
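Below is a minimal sketch of that (a, b, c) representation, with hypothetical names: it builds the line from two points, normalizes so that a^2 + b^2 == 1.0, and returns all zeros for the degenerate single-point case instead of dividing by zero.

```c
#include <math.h>

typedef struct { double a, b, c; } Line2D;   /* a*x + b*y + c == 0 */

/* Sketch: construct the line through two points.  The unit normal
   (a, b) is perpendicular to the direction (dx, dy); coincident points
   yield the degenerate representation (0, 0, 0). */
Line2D line_through(double x1, double y1, double x2, double y2)
{
    double dx = x2 - x1, dy = y2 - y1;
    double len = hypot(dx, dy);            /* robust sqrt(dx^2 + dy^2) */
    if (len == 0.0)
        return (Line2D){0.0, 0.0, 0.0};    /* degenerate: a single point */
    double a = -dy / len, b = dx / len;    /* a^2 + b^2 == 1 */
    return (Line2D){a, b, -(a * x1 + b * y1)};
}
```

Note that this construction treats vertical lines (b == 0) and horizontal lines (a == 0) identically; no orientation is privileged.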
It's not going to hurt to build an actual computation that would fail. Division and multiplication can be used to reliably produce the kind of values you are looking for. However, once you've calculated interesting values to compare, what's the point of recalculating them each time? Do you want to unit test the calculations?
One thing to consider is that numbers like 0.2 cannot be represented exactly in floating point. Using such a number as your threshold value could produce unexpected results. Numbers such as 0.5, 0.25, and 0.125 are exact in floating point. You might want to come up with unit tests that check the situations where the threshold itself is an estimate.
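A minimal sketch demonstrating the difference (the printed values are standard IEEE-754 double behavior):

```c
#include <stdio.h>

/* Sketch: 0.2 has no exact binary representation, while 0.125 (= 2^-3)
   does, so comparisons against 0.2 can surprise you. */
int main(void)
{
    printf("%.17g\n", 0.2);            /* 0.20000000000000001 */
    printf("%.17g\n", 0.125);          /* 0.125 */
    printf("%d\n", 0.1 + 0.2 == 0.3);  /* 0 - the sum rounds to
                                          0.30000000000000004 */
    return 0;
}
```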
Best Answer
I would always avoid successive floating-point operations unless the model I'm computing requires them. Floating-point arithmetic is unintuitive to most and a major source of errors. And telling the cases in which it causes errors apart from those in which it doesn't is an even subtler matter!
Therefore, using floats as loop counters is a defect waiting to happen and would require at the very least a fat background comment explaining why it's okay to use 0.5 here, and that this depends on the specific numeric value. At that point, rewriting the code to avoid float counters will probably be the more readable option. And readability is next to correctness in the hierarchy of professional requirements.
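To make the hazard concrete, here is a minimal sketch (IEEE-754 doubles assumed) contrasting a float loop counter with the integer rewrite:

```c
#include <stdio.h>

/* Sketch: 0.1 is not exactly representable, so an accumulated float
   counter drifts - with IEEE-754 doubles this loop runs 11 times,
   not the intended 10. */
int main(void)
{
    int n = 0;
    for (double x = 0.0; x < 1.0; x += 0.1)
        n++;
    printf("float counter: %d iterations\n", n);   /* 11 */

    /* The rewrite: an exact integer counter, with the float derived
       from it in a single rounding step (no accumulation). */
    n = 0;
    for (int i = 0; i < 10; i++) {
        double x = i * 0.1;
        n += (x < 1.0);
    }
    printf("integer counter: %d iterations\n", n); /* 10 */
    return 0;
}
```

(A step of 0.5, by contrast, is exact in binary floating point, which is precisely the kind of value-specific fact that background comment would have to explain.)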