Hello!
I ran into an interesting inconsistency a few days ago. While preparing my environment for a switch from torch 1.13 + cuda 11.7 to torch 2.1.2 + cuda 12.1, I double-checked the outputs of a default resnet34 with IMAGENET1K_V1 weights in both PyTorch versions. With torch.zeros(1, 3, 224, 224) as input, the exact same model produces slightly different values for all 1000 outputs. The same is true for the models that will be subjected to this PyTorch version switch in a production setting.
Now, I am wondering: Is this expected behaviour?
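
For anyone who wants to reproduce this, here is a minimal sketch of how such a comparison could be set up. The helper names (`dump_logits`, `max_abs_diff`, `all_close`), the JSON file names, and the tolerances are hypothetical choices for illustration, not the exact script I used. The idea is to run `dump_logits` once in each environment and then compare the two files in either one.

```python
import json


def dump_logits(path):
    """Run resnet34 (IMAGENET1K_V1) on an all-zeros input and save the 1000 logits.

    torch/torchvision are imported lazily so the comparison helpers below
    can be used without them.
    """
    import torch
    from torchvision.models import resnet34, ResNet34_Weights

    model = resnet34(weights=ResNet34_Weights.IMAGENET1K_V1)
    model.eval()  # use running BatchNorm statistics, not batch statistics
    with torch.no_grad():
        logits = model(torch.zeros(1, 3, 224, 224))
    with open(path, "w") as f:
        json.dump(logits.squeeze(0).tolist(), f)


def load_logits(path):
    """Load a list of logits written by dump_logits."""
    with open(path) as f:
        return json.load(f)


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length lists."""
    return max(abs(x - y) for x, y in zip(a, b))


def all_close(a, b, rtol=1e-4, atol=1e-5):
    """True if |x - y| <= atol + rtol * |y| for every pair.

    The default tolerances are assumptions for float32 outputs; adjust as needed.
    """
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


# Usage (file names are placeholders):
#   in the torch 1.13 environment:   dump_logits("logits_old.json")
#   in the torch 2.1.2 environment:  dump_logits("logits_new.json")
#   then, in either environment:
#     old = load_logits("logits_old.json")
#     new = load_logits("logits_new.json")
#     print(max_abs_diff(old, new), all_close(old, new))
```

Looking at the maximum absolute difference alongside a tolerance check makes it easier to tell whether the deviation is on the order of float32 rounding or something larger.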