Convolution vs sum - floating point error

Hi all,

I am reimplementing a model in PyTorch and reusing its pretrained weights. I have encountered small discrepancies that I believe come from floating-point precision issues.

Below is a simple example that shows the difference between a convolution over an all-ones input and a direct summation of the filter weights (mathematically, the two should be identical).

import torch
from torch.nn import functional as F

# cuDNN settings for reproducibility (no effect on this CPU example, but harmless)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

torch.manual_seed(0)

inputs = torch.ones((1, 2, 2, 2))
filters = torch.randn((1, 2, 2, 2))

# A filter covering the whole all-ones input reduces to a sum of the filter weights
res1 = F.conv2d(inputs, filters).numpy()[0, 0, 0, 0]
res2 = filters.sum().numpy()

print(res1, res2, res1 - res2)

# -1.6045358 -1.6045356 -2.3841858e-07
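
For scale: the printed difference, 2.3841858e-07, is exactly 2^-22, i.e. two float32 ulps at this magnitude. One quick sanity check is to compare the relative error against float32 machine epsilon; a minimal sketch, recreating the tensors above so it runs on its own:

import numpy as np
import torch
from torch.nn import functional as F

torch.manual_seed(0)
inputs = torch.ones((1, 2, 2, 2))
filters = torch.randn((1, 2, 2, 2))

res1 = F.conv2d(inputs, filters).numpy()[0, 0, 0, 0]
res2 = filters.sum().numpy()

# Relative error vs. float32 machine epsilon (2**-23, ~1.19e-07);
# the ratio should come out at roughly one eps
rel_err = abs(res1 - res2) / abs(res2)
print(rel_err, np.finfo(np.float32).eps)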

Is this expected, or is the error too large for such a small operation?

The internal order of operations will most likely differ between the two code paths, and even a plain sum can show the same floating-point precision error when the reduction order changes:

filters = torch.randn((1, 2, 2, 2))
res1 = filters.sum()                        # single reduction over all elements
res2 = filters.sum(0).sum(0).sum(0).sum(0)  # same sum, one dimension at a time
print(res1 - res2)
> tensor(2.3842e-07)
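
A quick way to confirm this is purely a precision effect is to repeat the comparison in float64, where the order-dependent error shrinks by many orders of magnitude. A minimal sketch (the re-seeding is an assumption so the snippet stands alone):

import torch

torch.manual_seed(0)
filters = torch.randn((1, 2, 2, 2)).double()  # same values, higher precision

res1 = filters.sum()                        # single reduction
res2 = filters.sum(0).sum(0).sum(0).sum(0)  # dimension-by-dimension reduction
print(res1 - res2)  # expect 0 or ~1e-16 instead of ~1e-7

The remaining mismatch scales with the dtype's machine epsilon, which is the signature of reduction-order floating-point error rather than a problem with the weights themselves.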