# Help understanding Batchnorm

I have a PyTorch model consisting of a `Conv2d` followed by a `BatchNorm2d`, and I am printing the output of each layer in the forward pass.

I cannot seem to understand the result of the output of BatchNorm based on the values of weight and bias it holds.

The following are the outputs as printed by PyTorch (the conv output, which is also the input to BatchNorm, followed by the BatchNorm output):

``````
tensor([[[[-0.0403,  0.0103,  0.0185],
          [ 0.0240,  0.0535,  0.0137],
          [ 0.0233,  0.0239, -0.0202]],

         [[-0.1044, -0.1664, -0.2347],
          [-0.1708, -0.2092, -0.2356],

tensor([[[[-1.6799, -0.0496,  0.2127],
          [ 0.3922,  1.3428,  0.0598],
          [ 0.3674,  0.3883, -1.0339]],

         [[ 0.4344,  0.1697, -0.1216],
          [ 0.1510, -0.0127, -0.1253],
``````

The outputs were printed from the `forward` method as:

``````
x1 = self.conv1(x)
print(x1)
x2 = self.bn(x1)
print(x2)
``````

Now, when I print the weight and bias of the BatchNorm layer, it shows this:

``````Parameter containing:

Parameter containing:

``````

If BatchNorm computes `weight * input + bias`, then the first output value should have been `(0.8352 * -0.0403) + 0 = -0.0336`, but it shows `-1.6799`.

Could someone please explain? I ask because one of my colleagues pointed this out. In our internal code the output is indeed -0.033 for the first index, so we wanted to understand the reasoning behind PyTorch's value and whether other factors are involved.


I think I figured this out. Can someone confirm?

It basically normalizes the conv output per channel, so that we have C means and variances. It adjusts the conv output by subtracting the mean (for that channel) and dividing by the standard deviation (the square root of the variance plus a small eps, for that channel), and then multiplies the result by the BatchNorm weight and adds the bias for that channel.
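This description can be checked numerically. A minimal sketch (the input shape and channel count are arbitrary assumptions) that reproduces `nn.BatchNorm2d` by hand in training mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3, 5, 5)   # N, C, H, W
bn = nn.BatchNorm2d(3)        # in train() mode by default

# Per-channel statistics over the N, H and W dimensions.
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)  # biased variance

# Normalize, then apply the affine parameters (weight and bias).
x_hat = (x - mean) / torch.sqrt(var + bn.eps)
manual = x_hat * bn.weight.view(1, -1, 1, 1) + bn.bias.view(1, -1, 1, 1)

out = bn(x)
print(torch.allclose(out, manual, atol=1e-5))  # True
```

Since `weight` starts at 1 and `bias` at 0, the output right after initialization is just the normalized input, which is why a value like `-0.0403` can map to `-1.6799`.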


Yes, that’s the method applied in `train()` mode.

If you call `model.eval()`, the running estimates will be used to normalize the input instead of the current batch statistic.
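A small sketch of that difference (shapes are arbitrary): the same input is normalized with the batch statistics in `train()` mode but with the running estimates in `eval()` mode, so the outputs generally differ:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(2)
x = torch.randn(4, 2, 3, 3)

bn.train()
out_train = bn(x)   # uses this batch's mean/var and updates the running stats

bn.eval()
out_eval = bn(x)    # uses the running estimates instead

print(torch.allclose(out_train, out_eval))  # False: different statistics
```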

Thanks! I am trying to ensure that BatchNorm stays in training mode but is frozen, because other layers will still be updated. Do I need to do this to the module after the net object is created?

net.bn.weight.requires_grad = False
net.bn.bias.requires_grad = False  # the bias is a parameter too
net.bn.train()

``````

If you don’t want to train the affine parameters at all (`weight` and `bias`), you could just initialize the batch norm layer with `affine=False`.
Otherwise, to disable their updates temporarily, you could set the `.requires_grad` attribute of both parameters to `False`, as in your example.
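A sketch of both options, assuming a small hypothetical model with a `bn` attribute as in the question:

```python
import torch.nn as nn

# Option 1: no affine parameters at all.
bn_no_affine = nn.BatchNorm2d(2, affine=False)
print(bn_no_affine.weight, bn_no_affine.bias)  # None None

# Option 2: keep the parameters but freeze them temporarily.
class Net(nn.Module):  # hypothetical model mirroring the question
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 2, 3)
        self.bn = nn.BatchNorm2d(2)

    def forward(self, x):
        return self.bn(self.conv1(x))

net = Net()
net.bn.weight.requires_grad = False
net.bn.bias.requires_grad = False
```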


Ok, I will try that. But is `net.bn.train()` absolutely required so that the layer does not behave as if it were in eval mode? If I am not wrong, all modules are in `train()` mode by default, so maybe it is not needed. Conversely, if I needed to do inference, would I necessarily have to call `net.bn.eval()`?

Yes, that’s right. All modules are in training mode by default after initialization.
Sorry, I had overlooked the last line of your code.

For inference, I would rather call `net.eval()`, which will set all modules recursively to evaluation mode.
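A quick sketch (with an arbitrary two-layer model) showing that `net.eval()` sets every submodule, including BatchNorm, to evaluation mode:

```python
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 2, 3), nn.BatchNorm2d(2))
model.eval()  # recursively sets training=False on all submodules
print(all(not m.training for m in model.modules()))  # True
```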
