How can I show that one function is more numerically stable than another equivalent function?

Hi, I’m looking for a suggestion. Suppose I have two equivalent functions, say softmax versus the softmax with max subtracting trick (which is more numerical stable). How would you demonstrate (e.g., in a paper) that one version is more numerically stable than the other?

I guess you would need to check that usual range of each variable. And for each operation you perform measure the difference between the largest and smallest value. This will give you the precision of the op. Then you can try and compute the worst case scenario for your function.

I’m sure you can find some material on google from people studying these subjects for differential equation solvers.

Hi Known!

In the context of functions, I would prefer to use the terms “more
accurate” and “less accurate.”*

The simplest way to compare the accuracy of various versions of
your function is to compare them to the exact (or at least a known,
much more accurate) result.

As a practical matter (but maybe not fully satisfying to a purist),
you can compare your results with those calculated using greater
precision. So, if your two implementations use 32-bit floating-point
arithmetic (dtype = float32), then you can also compute your
results using 64-bit arithmetic (dtype = float64) and treat those
as the “exact” results. Your (32-bit) version that comes closest
to the 64-bit result is the more accurate.

If you have access to an extended-precision / arbitrary-precision
math library, you can take this approach further (or apply it to
your 64-bit versions, etc.).

I would be comfortable using a 64-bit comparison to demonstrate the
greater accuracy of an improved 32-bit algorithm (but maybe that’s
just good-enough-for-ranch-work me), and I would certainly accept a
comparison with the results of a series of increasingly more precise
extended-precision computations.

*) “Numerical stability” has a pretty well understood connotation in
a variety of contexts. But when being used precisely, it is most
commonly used in the context of iterative algorithms: Unstable
algorithms have iterations that amplifies errors, while stable algorithms
have iterations that reduce errors. So, unless you are using an
iterative algorithm to compute your function, I would prefer to speak
of accuracy rather than stability.

Good luck.

K. Frank

Thank you for the pointers!

Thank you! I think I’m actually looking at both, accuracy and also as an iterative algorithm that amplifies errors, as you mentioned here:

Is there a reference about performing many consecutive matrix multiplications amplifying errors?

BTW, I’m checking out Trefethen’s book “Numerical Linear Algebra”, Lecture 14 “Stability”.