I’m converting a TensorFlow model to PyTorch, and I’d like to initialize the running mean and variance of BatchNorm2d using the values from my TensorFlow model.
I’m doing it roughly like this (tf_mean and tf_var stand in for the statistics loaded from my TensorFlow checkpoint):
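```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(64)

# Placeholders: in my real code these come from the TensorFlow checkpoint
tf_mean = torch.randn(64)
tf_var = torch.rand(64)

bn.weight = nn.Parameter(torch.ones(64))   # works
bn.bias = nn.Parameter(torch.zeros(64))    # works
bn.running_mean = nn.Parameter(tf_mean)    # causes the error below
bn.running_var = nn.Parameter(tf_var)

out = bn(torch.randn(2, 64, 8, 8))
```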
This fails with:

```
RuntimeError: the derivative for 'running_mean' is not implemented
```
But it works for bn.weight and bn.bias. Is there any way to initialize the running mean and variance from my pre-trained TensorFlow model? Is there anything like moving_mean_initializer and moving_variance_initializer in PyTorch?
Thanks!
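running_mean and running_var are registered as buffers, not parameters, so wrapping them in nn.Parameter is what triggers the error. One approach that should work is to copy the TensorFlow values into the buffers in-place under torch.no_grad() (a minimal sketch; tf_mean and tf_var are again placeholders for your checkpoint values):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(64)

# Placeholders for the statistics loaded from the TensorFlow checkpoint
tf_mean = torch.randn(64)
tf_var = torch.rand(64)

# running_mean / running_var are buffers, so copy the values in-place
# without tracking gradients:
with torch.no_grad():
    bn.running_mean.copy_(tf_mean)
    bn.running_var.copy_(tf_var)

out = bn(torch.randn(2, 64, 8, 8))  # forward pass now works
```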
Wow! It worked! Thank you! By the way, if I freeze the BN parameters (stop BN from updating), the running_mean and running_var will not change, and they will still be saved directly in the model. Am I right? Sorry for my bad English.
It depends on what you mean by “freezing”.
To use the running estimates without updating them, you could simply call .eval() on the batchnorm layer.
If you would like to freeze the affine parameters (weight and bias), you would need to set their requires_grad attribute to False.
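Here is a minimal sketch of both options, using a standalone BatchNorm2d as an example:

```python
import torch.nn as nn

bn = nn.BatchNorm2d(64)

# Option 1: stop updating the running estimates; in eval mode the forward
# pass uses (and no longer updates) running_mean / running_var
bn.eval()

# Option 2: freeze the affine parameters so the optimizer skips them
bn.weight.requires_grad = False
bn.bias.requires_grad = False
```

In both cases running_mean and running_var stay registered as buffers, so they are part of the module's state_dict and will be saved with the model. Note that calling .train() on the parent model puts the layer back into training mode, so .eval() has to be re-applied afterwards if you want the statistics to stay fixed.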