I was wondering if someone knows if this same bug exists in pytorch?
Essentially, the function's outputs depend on the batch size/data. For example, on a given input x you get y = f(x). Now add x to a batch of data z and pass the batch through. You would expect that at x's index in z, y_z equals y, but this is not the case.
I have to admit it’s a strange question to ask for a specific bug.
However, if you set your batch norm layer to .eval() in PyTorch, the result will not depend on the batch, and the running estimates will be used for each sample.
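Here's a minimal sketch of what I mean (the layer size, seed, and batch shapes are just made up for illustration): after warming up the running estimates in train mode, calling the layer in eval mode on x alone gives the same result as calling it on a batch that contains x.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn = nn.BatchNorm1d(4)

# "Warm up" the running estimates with a few training batches
bn.train()
for _ in range(10):
    bn(torch.randn(8, 4))

# Switch to eval mode: running estimates are used, not batch statistics
bn.eval()

x = torch.randn(1, 4)                   # a single sample
z = torch.cat([x, torch.randn(7, 4)])   # the same sample inside a larger batch

y_single = bn(x)
y_batch = bn(z)[:1]

# In eval mode the result for x does not depend on the rest of the batch
print(torch.allclose(y_single, y_batch))  # True
```

In train mode, on the other hand, the batch statistics are computed from the current batch, so the output for x would change with the other samples in z.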
Let us know if you encounter any strange behavior.