I noticed a strange behavior of the nn.Conv2d layer.
When I pass the same data through the layer, the output varies slightly depending on the input batch size. This happens ONLY when I run the calculations on the GPU.
Here is a code snippet to illustrate what I mean:
```python
import torch
import torch.nn as nn

x100 = torch.randn(100, 64, 56, 56)
x1 = x100[:1]  # the same first sample, as a batch of size 1
my_conv = nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1, bias=False)

# First, let us do the calculations on the CPU
x1_out = my_conv(x1)
x100_out = my_conv(x100)
print(torch.max(torch.abs(x1_out - x100_out[:1])))
# As expected, the last command gives a zero tensor as a result:
# tensor(0., grad_fn=<MaxBackward1>)

# Now, let us do the same on the GPU
x1 = x1.to('cuda')
x100 = x100.to('cuda')
my_conv.to('cuda')
x1_out = my_conv(x1)
x100_out = my_conv(x100)
print(torch.max(torch.abs(x1_out - x100_out[:1])))
# The last command gives a non-zero result:
# tensor(2.6226e-06, device='cuda:0', grad_fn=<MaxBackward1>)
```
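For what it's worth, the deviation looks like ordinary float32 rounding noise rather than a real discrepancy. A quick sketch of how I checked, reusing `x1_out` and `x100_out` from the GPU run above (the `1e-5` absolute tolerance is my own choice, not anything prescribed):

```python
# The max deviation (~2.6e-6) is well below this (arbitrarily chosen)
# absolute tolerance, so the two outputs compare as equal within
# float32 precision.
print(torch.allclose(x1_out, x100_out[:1], atol=1e-5))  # True
```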
I am probably missing something, and there should be an easy explanation for this, but I cannot find one… Does anybody have an idea?
Thanks in advance.