How to formulate the batchnorm calculation into a linear transformation?

@Sun-xiaohui The author of the original question wanted to use some internal BatchNorm values to manipulate the data (see first post).
Since BatchNorm is already applied in the forward pass, I don’t really know what happens if you apply it another time in that last manipulation. What is your use case?
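For reference, here is roughly what I understand the forward pass in question to look like, pieced together from the previous posts (a sketch, not the original MyBatchNorm code; the way k and h are rebuilt from the layer's statistics is an assumption):

import torch
import torch.nn as nn

class MyBatchNorm(nn.Module):
    # Sketch reconstructed from this thread; the original code may differ.
    def __init__(self, num_features):
        super().__init__()
        self.bn = nn.BatchNorm1d(num_features)

    def forward(self, x):
        x = self.bn(x)  # BatchNorm is already applied here
        with torch.no_grad():
            # k and h rebuilt from the layer's own statistics (assumed)
            k = self.bn.weight / torch.sqrt(self.bn.running_var + self.bn.eps)
            h = self.bn.bias - self.bn.running_mean * k
        # this applies the normalization a second time, on the already normalized x
        x.data = k * x.data + h
        return x

If that matches your code, the last line is what I meant by applying it another time.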

I’m sorry my question was confusing. What I wanted to ask is this: in the rewritten MyBatchNorm method, the original author wants to express BatchNorm as y = kx + h. The corresponding k is computed, but the formula used for the return value, x.data = k * x.data + beta, uses beta, which is inconsistent with h. I’m not sure about this; did you notice it?

And yes, about your question of applying it again at the end: I’m not very familiar with PyTorch. Do you mean that the x = self.bn(x) operation in the forward already passes the input x through BatchNorm and assigns the result back to x, so that running x.data through the BatchNorm formula again is equivalent to applying batch norm to x twice? I’m not sure whether my understanding is right. If that’s the case, the x.data operation could simply be removed, but if it can be removed, what is the purpose of writing x.data = k * x.data + beta in MyBatchNorm in the first place? At the same time, the returned value of x will take part in the activation functions of the rest of the network. I hope you can answer this, and I hope my question makes sense.

Don’t worry! We are here to help :wink:

Well, maybe it was a typo, but he asked for y = kx + b.

Maybe you are right and it should be y=kx+h.
@LJ_Mason Could you confirm @Sun-xiaohui 's remark?

Yes, x = self.bn(x) applies the batch norm on x, so that x_out = (x - mean[x])/sqrt(Var[x] + eps) * gamma + beta. I don’t know what kind of normalization one would get applying k and h after the BatchNorm. Maybe @LJ_Mason could give some feedback on this.
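To make the linear form concrete, here is a small self-contained check (my own snippet, not code from this thread) showing that in eval mode the BatchNorm output equals y = k * x + h with k = gamma / sqrt(Var + eps) and h = beta - mean * k:

import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4)
bn.eval()  # use the running statistics, no updates

x = torch.randn(8, 4)
y_bn = bn(x)

k = bn.weight / torch.sqrt(bn.running_var + bn.eps)  # gamma / sqrt(Var + eps)
h = bn.bias - bn.running_mean * k                     # beta - mean * gamma / sqrt(Var + eps)
y_lin = k * x + h

print(torch.allclose(y_bn, y_lin, atol=1e-6))  # True, the two agree in eval mode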

Do you want to backpropagate through k and h somehow? gamma and beta are already being updated in the BatchNorm layer (self.bn).
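(As a quick check, a snippet of my own: gamma and beta are exposed as bn.weight and bn.bias, and both are trainable parameters, so the optimizer already updates them.)

import torch.nn as nn

bn = nn.BatchNorm1d(4)
print(bn.weight.requires_grad, bn.bias.requires_grad)  # True True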

I hope I could help!

Yes, that helps, thanks for your answer. I still have some questions about whether I can omit the last x.data operation in the forward; the purpose of that operation goes back to the original author's goal of expressing the result as a linear function. As you said, it may cause the input to go through batch norm a second time, and the returned x will take part in the rest of the network after the BatchNorm. If it does, is the returned x the same kind of value as what PyTorch’s own BatchNorm returns after processing the input? And even if I can remove the x.data operation, will doing so affect the training of the network? Thanks.

@ptrblck
I am also trying to emulate BatchNorm2d manually. I first extract a BatchNorm2d module from a trained network, read out its parameters, and use them in my own batch norm computation. The code is below:

import copy
import torch

bn = copy.deepcopy(net.bn)          # copy the trained BatchNorm2d module
bn.track_running_stats = False
x = torch.rand(2, 16, 1, 1)
print(bn(x).view(2, -1))            # output of the BatchNorm2d module

# rebuild the same transform as y = k * x + h from the stored statistics
k = bn.weight / torch.sqrt(bn.running_var + 1e-05)
h = bn.bias - bn.running_mean * bn.weight / torch.sqrt(bn.running_var + 1e-05)

print((x.view(2, 16) * k + h).view(2, -1))   # manual computation

and the two outputs do not match. I wonder why this happens. Thanks!
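A likely explanation (hedged, since it depends on the PyTorch version and on whether the copied module is still in training mode): with track_running_stats = False and/or in training mode, the BatchNorm layer normalizes with the current batch's mean and variance instead of running_mean / running_var, so its output cannot match k and h, which are built from the running statistics. Keeping track_running_stats at its default True and putting the copied module into eval mode should make the two computations agree (up to floating point precision), for example:

import copy
import torch

bn = copy.deepcopy(net.bn)   # net.bn is the trained BatchNorm2d from the question above
bn.eval()                    # eval mode: the layer uses running_mean / running_var

x = torch.rand(2, 16, 1, 1)
print(bn(x).view(2, -1))

k = bn.weight / torch.sqrt(bn.running_var + bn.eps)   # bn.eps defaults to 1e-05
h = bn.bias - bn.running_mean * k

print((x.view(2, 16) * k + h).view(2, -1))            # should now match the BatchNorm output

With bn.eval(), the layer applies exactly (x - running_mean) / sqrt(running_var + eps) * weight + bias, which is the same linear map as k * x + h.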