Numerical instability with CUDA

My network produces slightly different outputs for the same input, depending on the size of the batch it is forwarded in. This happens on my GTX 1080 Ti. On the CPU, the results seem to be stable and independent of the batch size.
Here is a minimal example that reproduces it:

import torch
import torch.nn as nn
import numpy as np

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, 1, 1)
        self.conv2 = nn.Conv2d(32, 32, 3, 2, 1)

    def forward(self, tensor):
        x = self.conv1(tensor)
        x = self.conv2(x)
        return x

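# The model has to be on the same device as its inputs.
model = Model().cuda()

# Forward a single image (batch size 1).
in_tensor_1 = torch.randn([1, 3, 32, 32]).cuda()
out_1 = model(in_tensor_1).detach().cpu().numpy()[0]

# Forward 16 copies of the same image as one batch.
in_tensor_2 = torch.empty([16, 3, 32, 32]).cuda()
for i in range(16):
    in_tensor_2[i] = in_tensor_1[0]
out_2 = model(in_tensor_2).detach().cpu().numpy()[0]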

print(np.array_equal(out_1, out_2))
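
Since np.array_equal only reports whether every element matches exactly, it can also help to look at how large the deviation is. A minimal sketch of that check follows; the arrays a and b are made-up stand-ins for out_1 and out_2 (not measured values), and the tolerance of 1e-5 is an arbitrary assumption:

```python
import numpy as np

# Hypothetical stand-ins for out_1 and out_2: same values up to float32-sized noise.
rng = np.random.default_rng(0)
a = rng.standard_normal((32, 16, 16)).astype(np.float32)
b = a + np.float32(1e-7) * rng.standard_normal(a.shape).astype(np.float32)

exactly_equal = np.array_equal(a, b)             # strict element-wise equality
max_abs_diff = float(np.abs(a - b).max())        # size of the largest deviation
close = np.allclose(a, b, rtol=0.0, atol=1e-5)   # equality within an absolute tolerance

print("exactly equal:", exactly_equal)
print("max abs diff:", max_abs_diff)
print("close within 1e-5:", close)
```

A tiny maximum difference together with a failing exact comparison would point to rounding differences rather than a logic error in the model.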



I think you want to check the Reproducibility section of the docs :slight_smile:

Setting torch.backends.cudnn.deterministic = True did the trick. I had no idea that even simple convolutional layers can behave non-deterministically in cuDNN.
Well, you never stop learning :wink:
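
For anyone landing here later, this is roughly how the flag fits into the reproduction script. Treat it as a sketch under a few assumptions rather than a guaranteed recipe: the nn.Sequential model is a compact stand-in for the Model class above, the fixed seed and the CPU fallback are only there so the snippet runs anywhere, and disabling cudnn.benchmark is an extra precaution taken from the reproducibility notes rather than something tested in this thread. Whether the two outputs end up bitwise identical may still depend on the GPU, cuDNN version, and PyTorch version.

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)                        # assumption: fixed seed for repeatable weights and inputs
torch.backends.cudnn.deterministic = True   # ask cuDNN to use deterministic algorithms only
torch.backends.cudnn.benchmark = False      # assumption: also disable per-shape algorithm auto-tuning

# Fall back to the CPU when no GPU is available, so the sketch still runs.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Same two convolutions as in the Model class from the question.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, 1, 1),
    nn.Conv2d(32, 32, 3, 2, 1),
).to(device).eval()

single = torch.randn(1, 3, 32, 32, device=device)
batch = single.repeat(16, 1, 1, 1)          # 16 copies of the same image

with torch.no_grad():
    out_single = model(single)[0].cpu().numpy()
    out_batch = model(batch)[0].cpu().numpy()

max_abs_diff = float(np.abs(out_single - out_batch).max())
print("bitwise equal:", np.array_equal(out_single, out_batch))
print("max abs diff:", max_abs_diff)
```

The flags are set before any convolution runs, so they apply to the first forward pass as well.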