Hi, I’m looking for a suggestion. Suppose I have two equivalent functions, say plain softmax versus softmax with the max-subtraction trick (which is more numerically stable). How would you demonstrate (e.g., in a paper) that one version is more numerically stable than the other?
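For concreteness, here is the pair of versions I mean, as a minimal NumPy sketch (the function names are mine):

```python
import numpy as np

def softmax_naive(x):
    # exp() can overflow for large inputs
    e = np.exp(x)
    return e / e.sum()

def softmax_stable(x):
    # subtracting the max first keeps the exponents <= 0, so exp() never overflows
    e = np.exp(x - x.max())
    return e / e.sum()

x = np.array([10.0, 1000.0], dtype=np.float32)
print(softmax_naive(x))   # exp(1000.) overflows float32 to inf, producing nan
print(softmax_stable(x))  # close to [0., 1.]
```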

I guess you would need to check the usual range of each variable, and for each operation you perform, measure the rounding error it can introduce given that range. That gives you the precision of each op, and from there you can try to compute a worst-case error bound for your function.
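As a starting point for that kind of per-op analysis, NumPy exposes the unit roundoff via `np.finfo`, and a naive accumulation makes the per-op rounding visible (a sketch of the idea, not a full worst-case analysis):

```python
import numpy as np

# Unit roundoff: roughly the maximum relative error a single float32 op introduces.
eps32 = np.finfo(np.float32).eps   # about 1.19e-07

# 0.1 is not exactly representable, and each addition rounds to the float32
# grid near the running sum, so the per-op errors pile up over many ops.
s = np.float32(0.0)
for _ in range(1_000_000):
    s += np.float32(0.1)
print(s)   # noticeably different from the exact 100000.0
```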

I’m sure you can find some material on Google from people studying these subjects for differential-equation solvers.

Hi Known!

In the context of functions, I would prefer to use the terms “more accurate” and “less accurate.”*

The simplest way to compare the accuracy of various versions of your function is to compare them to the exact (or at least a known, much more accurate) result.

As a practical matter (but maybe not fully satisfying to a purist), you can compare your results with those calculated using greater precision. So, if your two implementations use 32-bit floating-point arithmetic (`dtype = float32`), then you can also compute your results using 64-bit arithmetic (`dtype = float64`) and treat those as the “exact” results. Your (32-bit) version that comes closest to the 64-bit result is the more accurate.

If you have access to an extended-precision / arbitrary-precision math library, you can take this approach further (or apply it to your 64-bit versions, etc.).

I would be comfortable using a 64-bit comparison to demonstrate the greater accuracy of an improved 32-bit algorithm (but maybe that’s just good-enough-for-ranch-work me), and I would certainly accept a comparison with the results of a series of increasingly more precise extended-precision computations.
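This comparison can be scripted directly. A sketch (NumPy; the function names and the test inputs are mine): compute the result in float64 and treat it as “exact,” then measure each float32 version against it.

```python
import numpy as np

def softmax_naive(x):
    e = np.exp(x)
    return e / e.sum()

def softmax_stable(x):
    # max-subtraction trick
    e = np.exp(x - x.max())
    return e / e.sum()

# Inputs chosen to include values large enough to stress float32 exp().
x64 = np.linspace(-50.0, 100.0, 101)           # float64
x32 = x64.astype(np.float32)

reference = softmax_stable(x64)                 # float64, treated as "exact"

err_naive = np.abs(softmax_naive(x32).astype(np.float64) - reference).max()
err_stable = np.abs(softmax_stable(x32).astype(np.float64) - reference).max()
print(err_naive, err_stable)   # the naive version overflows here (exp(100.) -> inf)
```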

*) “Numerical stability” has a pretty well understood connotation in a variety of contexts. But when used precisely, it most commonly refers to iterative algorithms: unstable algorithms have iterations that amplify errors, while stable algorithms have iterations that reduce errors. So, unless you are using an iterative algorithm to compute your function, I would prefer to speak of accuracy rather than stability.

Good luck.

K. Frank

Thank you for the pointers!

Thank you! I think I’m actually looking at both: accuracy, and also an iterative algorithm that amplifies errors, as you mentioned here:

Is there a reference on how performing many consecutive matrix multiplications amplifies errors?
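Here is the effect I mean, concretely: iterating the same matrix multiply in float32 and in float64 and watching the float32 iterate drift away from the float64 “exact” one (a sketch; the matrix scaling and the seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
A64 = rng.standard_normal((n, n)) / np.sqrt(n)   # scaled so the iterates stay O(1)
x64 = rng.standard_normal(n)
A32, x32 = A64.astype(np.float32), x64.astype(np.float32)

y64, y32 = x64, x32
errs = []
for _ in range(30):
    y64 = A64 @ y64                  # float64 reference recurrence
    y32 = A32 @ y32                  # the same recurrence in float32
    errs.append(np.linalg.norm(y32 - y64) / np.linalg.norm(y64))

# The relative error of the float32 iterate tends to grow with each multiply.
print(errs[0], errs[-1])
```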

BTW, I’m checking out Trefethen’s book “Numerical Linear Algebra”, Lecture 14 “Stability”.