Consider a single linear layer as follows. `x` and `w` are vectors of the same size and serve as the input and the weight of this layer:

```python
import numpy as np
import torch
from torch.nn import Linear

x = torch.rand(1, 63)
w = torch.rand(1, 63)
fc = Linear(63, 1, bias=False)
```
Suppose I evaluate the following block:

```python
permute = np.random.permutation(63)
fc.load_state_dict({'weight': w}, strict=False)
val1 = fc(x)[0, 0].item()
fc.load_state_dict({'weight': w[:, permute]}, strict=False)
val2 = fc(x[:, permute])[0, 0].item()
print(val1)
print(val2)
```
Note that `val2` is computed by applying the same permutation to the elements of both `x` and `w`. Theoretically, `val1` and `val2` should have the same value, namely the dot product of `x` and `w`.
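To be clear about what "theoretically" means here: in exact arithmetic, applying the same permutation to both vectors cannot change the dot product, since the same products are summed in a different order. A minimal sketch of this using exact rationals (pure Python, no PyTorch; the variable names are my own):

```python
import random
from fractions import Fraction

random.seed(0)

# Two random vectors of 63 exact rational numbers.
x = [Fraction(random.randint(1, 100), random.randint(1, 100)) for _ in range(63)]
w = [Fraction(random.randint(1, 100), random.randint(1, 100)) for _ in range(63)]

# A random permutation of the indices 0..62.
perm = random.sample(range(63), 63)

dot = sum(a * b for a, b in zip(x, w))
dot_perm = sum(x[i] * w[i] for i in perm)

# With exact arithmetic, the two sums are always identical.
print(dot == dot_perm)  # True
```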
In practice, when I execute the block in Jupyter multiple times with different permutations, `val1` and `val2` occasionally come out as slightly different numbers, e.g. 14.554329872131348 vs. 14.554328918457031.
(PyTorch version: 1.10.0. The issue persists regardless of whether the device is CPU or GPU.)
I would like to know why this happens. Is it some sort of numerical-stability issue, or is there a randomized factor in how these layers are evaluated?
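My current suspicion is the former: floating-point addition is not associative, so summing the same 63 products in a different order can round differently. A minimal pure-Python sketch of that effect (no PyTorch involved; the names are my own):

```python
import random

# Grouping alone changes a floating-point sum.
a = (0.1 + 0.2) + 0.3   # 0.6000000000000001
b = 0.1 + (0.2 + 0.3)   # 0.6
print(a == b)           # False

# The same effect for a dot product: summing identical products
# in a permuted order can give a slightly different float.
random.seed(0)
xs = [random.random() for _ in range(63)]
ws = [random.random() for _ in range(63)]
perm = random.sample(range(63), 63)

d1 = sum(x * w for x, w in zip(xs, ws))
d2 = sum(xs[i] * ws[i] for i in perm)
print(abs(d1 - d2))     # tiny, but often nonzero
```

If this is indeed the cause, the discrepancy I see would just be rounding at the level of the last few bits of a float32 result, not a bug or hidden randomness.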