How to identify unstable floating point computations

Tags: numbers, numeric precision

In numerics, it is very important to be able to identify unstable
schemes and to improve their stability. How does one identify unstable
floating point computations?

I am working on a very complex simulation where many numerical schemes work together, and I am looking for a methodical way to identify its weak parts. The simulation is a physical model involving differential equations. A bird's eye view of the overall process is:

  1. (Preliminary step) Gather physical observations P.

  2. Determine initial parameters of the simulation. This uses an optimisation algorithm, where we walk in a parameter space and look for parameters C such that some error function E(F(C), P) is minimised, where F is some derived quantity of the parameters.

  3. Plug C into the simulation engine. This is an Euler scheme for the PDE, so that at each time step, we compute the terms driving the dynamics (each of them is a complex function, potentially subject to instability) and feed these terms to the Euler scheme to compute the next state. This goes on for thousands of time steps (a sketch of the whole pipeline follows below).

  4. At the end of the simulation, we compute some function Proof(S) of the final state S and compare it to some quantities Require(P) deduced from the observed quantities. This is not a formal proof of the result, more of a plausibility check.

Also, I see a tower of complex operations (computation of dynamic terms, within the Euler scheme, within the Proof) and would like to recognise the “bad parts” and fix them.
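For concreteness, here is a minimal Python sketch of such a pipeline. Everything in it is a toy placeholder of my own (F, E, dynamic_terms, Proof, Require and the data); it only shows the shape of the calibrate/integrate/check process, not the real model:

    import numpy as np
    from scipy.optimize import minimize

    P = np.array([1.0, 2.0, 3.0])        # observed quantities (placeholder data)

    def F(C):                            # derived quantity of the parameters (toy model)
        return C[0] + C[1] * P

    def E(fc, p):                        # error function to minimise
        return np.sum((fc - p) ** 2)

    # Step 2: calibrate the initial parameters C against the observations P.
    C = minimize(lambda c: E(F(c), P), x0=np.zeros(2)).x

    # Step 3: explicit Euler scheme driven by the calibrated parameters.
    def dynamic_terms(state, C):         # the potentially unstable part
        return -C[0] * state + C[1]

    state = np.ones(3)                   # initial state (placeholder)
    dt = 1e-3
    for _ in range(10_000):              # thousands of time steps
        state = state + dt * dynamic_terms(state, C)

    # Step 4: plausibility check on the final state.
    def Proof(S):
        return S.sum()

    def Require(p):
        return p.sum()

    print("plausible:", np.isclose(Proof(state), Require(P), rtol=0.1))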

I speculate that using a software implementation of floating point
numbers with reduced precision would magnify the instability of
numerical schemes, thus easing the comparison between different
implementations. Is this a common technique to investigate this
question? Is it possible to use a virtual machine, such as Bochs, to
achieve this without altering the program?
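Short of a software float implementation or a VM, one lighter-weight variant of the same idea is to rerun a kernel at a lower native precision and compare the results. A sketch in Python/NumPy, where the one-pass variance formula merely stands in for some unstable kernel of the simulation:

    import numpy as np

    def naive_variance(x):
        # One-pass formula E[x^2] - E[x]^2: prone to catastrophic cancellation.
        n = x.size
        return (np.sum(x * x) - np.sum(x) ** 2 / n) / n

    rng = np.random.default_rng(0)
    data = 1e6 + rng.standard_normal(100_000)   # large offset, small spread

    for dtype in (np.float64, np.float32):
        x = data.astype(dtype)
        reference = np.var(x.astype(np.float64))
        print(dtype.__name__, float(naive_variance(x)), float(reference))
    # The float32 run degrades far more visibly, pointing at the cancellation.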

To deal appropriately with the stability question, it is sometimes
acceptable to target the typical input of the numerical procedure, so
that it can be tuned to do well on that input and perhaps less well on
other valid, yet unlikely, inputs. Given a sample of typical inputs,
it is possible to snoop on some intermediate results and prepare a
statistical profile of them. Again, is this a common technique to
study stability issues? Is a virtual machine useful for this?
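This does not need a virtual machine either: the simulation code can be instrumented directly. A hypothetical sketch (snoop, trace and dynamic_terms are names I made up) that records intermediate values on a sample of typical inputs and prints a statistical profile:

    import numpy as np
    from collections import defaultdict

    trace = defaultdict(list)

    def snoop(name, value):
        # Record a copy of an intermediate quantity under a given name.
        trace[name].append(np.asarray(value, dtype=np.float64).copy())
        return value

    def dynamic_terms(state, C):
        # Placeholder dynamic-term computation, tapped at the points of interest.
        drift = snoop("drift", -C[0] * state)
        forcing = snoop("forcing", C[1] * np.ones_like(state))
        return snoop("rhs", drift + forcing)

    def profile():
        for name, values in trace.items():
            v = np.concatenate([x.ravel() for x in values])
            print(f"{name:>8}: min={v.min():.3e} max={v.max():.3e} "
                  f"mean={v.mean():.3e} std={v.std():.3e}")

    # Run on a typical input, then inspect the profile.
    C, state = np.array([0.5, 1.0]), np.ones(4)
    for _ in range(1_000):
        state = state + 1e-3 * dynamic_terms(state, C)
    profile()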

Best Answer

The study of the stability of floating point computations is part of numerical analysis, and if you really want a sound result, you want someone knowledgeable in that domain to do the analysis of the algorithms used.

There are some things which can help to experimentally identify unstable algorithms: running with rounding set to different modes (up/down/random) or with different precisions, and checking that the results don't vary too much. Answering “is this too much?” isn't simple at all, and even when the answer is no, that doesn't mean the algorithm is stable, just that it wasn't detected as unstable on the data set you used.
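As a small illustration of that experiment, Python's decimal module lets a kernel be rerun under different working precisions and rounding modes from user code (the kernel here is a toy stand-in; the real routine would have to be rewritten over Decimal):

    from decimal import Decimal, localcontext, ROUND_FLOOR, ROUND_CEILING, ROUND_HALF_EVEN

    def kernel(n=30):
        # Toy computation standing in for one numerical routine of the program.
        x = Decimal(1) / Decimal(3)
        for _ in range(n):
            x = 4 * x * (1 - x)      # logistic map: amplifies rounding errors
        return x

    for prec in (10, 20, 40):
        for mode in (ROUND_FLOOR, ROUND_CEILING, ROUND_HALF_EVEN):
            with localcontext() as ctx:
                ctx.prec = prec
                ctx.rounding = mode
                print(f"prec={prec:2d} {mode:16s} -> {kernel()}")
    # Results that drift wildly across precisions/modes flag a suspect kernel.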

Interval arithmetic has been proposed in the comments. When I looked at it, even the most rabid proponents of interval arithmetic admitted that it worked well with algorithms designed for interval arithmetic, but that switching to it without analysing the algorithm and ensuring it contained no patterns known to behave poorly wouldn't be useful. (Opponents seemed of the opinion that the preconditions for interval arithmetic to be useful were too restrictive to be of practical interest.)
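To see why switching an existing algorithm to intervals without analysis tends to disappoint, here is a toy interval type in Python (no outward rounding, which a real package such as mpmath.iv would provide). The point iteration below converges to 0.5, yet the interval enclosure keeps widening because x and (1 - x) are treated as independent, the classic dependency problem:

    from dataclasses import dataclass

    @dataclass
    class Interval:
        lo: float
        hi: float
        def __sub__(self, other):
            return Interval(self.lo - other.hi, self.hi - other.lo)
        def __mul__(self, other):
            p = [self.lo * other.lo, self.lo * other.hi,
                 self.hi * other.lo, self.hi * other.hi]
            return Interval(min(p), max(p))

    two, one = Interval(2.0, 2.0), Interval(1.0, 1.0)
    x = Interval(0.40, 0.45)
    for i in range(10):
        x = two * x * (one - x)     # point version 2*x*(1-x) is perfectly stable
        print(f"step {i}: [{x.lo:.4f}, {x.hi:.4f}]  width = {x.hi - x.lo:.4f}")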