How to initialize mean and variance of BatchNorm2d?

I’m converting a TensorFlow model to PyTorch, and I’d like to initialize the running mean and variance of BatchNorm2d from the TensorFlow model.
I’m doing it in this way:

bn.running_mean = torch.nn.Parameter(torch.Tensor(TF_param))

And I get this error:

RuntimeError: the derivative for 'running_mean' is not implemented

But it works for bn.weight and bn.bias. Is there any way to initialize the mean and variance from my pre-trained TensorFlow model? Is there anything like moving_mean_initializer and moving_variance_initializer in PyTorch?
Thanks!

Could you try to assign a torch.tensor instead of an nn.Parameter, since the running estimates do not require gradients?
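
A minimal sketch of that approach, assuming the TensorFlow moving statistics have already been exported as NumPy arrays (tf_mean and tf_var are hypothetical names here, and num_features must match your layer):

import numpy as np
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(num_features=64)

# Hypothetical stand-ins for the moving statistics exported from TensorFlow
tf_mean = np.zeros(64, dtype=np.float32)
tf_var = np.ones(64, dtype=np.float32)

# Assign plain tensors, not nn.Parameters: running_mean and running_var are
# buffers that only track statistics and must not require gradients
bn.running_mean = torch.from_numpy(tf_mean)
bn.running_var = torch.from_numpy(tf_var)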

WoW! It worked! Thank you! By the way, if I freeze the bn parameters (stop bn from updating), running_mean and running_var will not change, and they will still be saved in the model directly. Am I right? Sorry for the bad English.

It depends on what you mean by “freezing”.
To use the running estimates without updating, you could simply call .eval() on the batchnorm layer.
If you would like to freeze the affine parameters (weight and bias), you would need to set their requires_grad attribute to False.
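
For example, a minimal sketch of both options, continuing with the bn layer from above:

# Use the stored running estimates and stop updating them in forward passes
bn.eval()

# Freeze the affine parameters so the optimizer no longer updates them
bn.weight.requires_grad = False
bn.bias.requires_grad = False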

Thanks. I meant requires_grad.