I’m converting a TensorFlow model to PyTorch, and I’d like to initialize the mean and variance of BatchNorm2d from the TensorFlow model.
I’m doing it this way:
bn.running_mean = torch.nn.Parameter(torch.Tensor(TF_param))
And I get this error:
RuntimeError: the derivative for 'running_mean' is not implemented
But it works for bn.weight and bn.bias. Is there any way to initialize the mean and variance using my pre-trained TensorFlow model? Is there anything like moving_mean_initializer and moving_variance_initializer in PyTorch?
Thanks!
Could you try to assign a torch.tensor instead of an nn.Parameter, since the running estimates do not require gradients?
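A minimal sketch of that fix, assuming the TensorFlow moving statistics have already been exported as NumPy arrays (the names tf_moving_mean / tf_moving_var and the channel count of 16 are placeholders, not from the original post):

```python
import numpy as np
import torch

# Placeholder stats exported from the TensorFlow model
tf_moving_mean = np.full(16, 0.5, dtype=np.float32)
tf_moving_var = np.full(16, 2.0, dtype=np.float32)

bn = torch.nn.BatchNorm2d(16)

# running_mean and running_var are buffers, not Parameters, so assign
# plain tensors -- wrapping them in nn.Parameter raises the RuntimeError above.
bn.running_mean = torch.from_numpy(tf_moving_mean)
bn.running_var = torch.from_numpy(tf_moving_var)
```

Because the running estimates are registered as buffers, plain tensor assignment works and the values are still included in the module's state_dict.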
Wow, it worked! Thank you! By the way, if I freeze the bn parameters (stop bn from updating), the running_mean and running_var will not change, and they will still be saved in the model directly. Am I right? Sorry for my bad English.
It depends on what you mean by “freezing”.
To use the running estimates without updating them, you could simply call .eval() on the batchnorm layer.
If you would like to freeze the affine parameters (weight and bias), you would need to set their requires_grad attribute to False.
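Both kinds of freezing described above can be sketched as follows (the layer size and input shape are arbitrary placeholders):

```python
import torch

bn = torch.nn.BatchNorm2d(16)

# Freeze the running estimates: eval() makes the forward pass use the
# stored running_mean/running_var without updating them.
bn.eval()

# Freeze the affine parameters as well, so the optimizer cannot change them.
bn.weight.requires_grad_(False)
bn.bias.requires_grad_(False)

# In eval mode a forward pass leaves the running stats untouched.
x = torch.randn(4, 16, 8, 8)
before = bn.running_mean.clone()
_ = bn(x)
assert torch.equal(bn.running_mean, before)
```

Note that .eval() and requires_grad are independent: .eval() controls the stat updates during forward, while requires_grad controls gradient flow to weight and bias.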
Thanks. I meant requires_grad.