I was wondering if someone knows if this same bug exists in pytorch?
Essentially, the function's outputs depend on the batch size/data. For example, on a given input x you get y = f(x). Now add x to a batch of data z and pass the batch through. You would expect that at x's index in z, y_z equals y, but this is not the case.
I have to admit it’s a strange question to ask for a specific bug.
However, if you set your batch norm layer to .eval() in PyTorch, the result will not depend on the batch, and the running estimates will be used for each sample.
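Here's a minimal sketch of what I mean (the layer size, seed, and batch shapes are just made up for illustration): after warming up the running estimates in train mode, calling the layer in eval mode on x alone gives the same result as calling it on a batch that contains x.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn = nn.BatchNorm1d(4)

# "Warm up" the running estimates with a few training batches
bn.train()
for _ in range(10):
    bn(torch.randn(8, 4))

# Switch to eval mode: running estimates are used, not batch statistics
bn.eval()

x = torch.randn(1, 4)                   # a single sample
z = torch.cat([x, torch.randn(7, 4)])   # the same sample inside a larger batch

y_single = bn(x)
y_batch = bn(z)[:1]

# In eval mode the result for x does not depend on the rest of the batch
print(torch.allclose(y_single, y_batch))  # True
```

In train mode, on the other hand, the batch statistics are computed from the current batch, so the output for x would change with the other samples in z.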
Let us know if you encounter any strange behavior.