In Caffe we can set the learning rate to 0 by using `lr_mult: 0`. It means only the mean/var are calculated, but no parameters are learned during training.
In PyTorch, is `affine=False` the same as `lr_mult: 0`?
nn.BatchNorm3d(num_features=64, momentum=0.999, affine=False)
affine=False will remove the learnable gamma and beta terms from the calculation, so the layer only normalizes with the mean and var. So that's basically what you want.
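To make this concrete, here is a small sketch (assuming the `nn.BatchNorm3d` call from the question) showing that `affine=False` leaves the layer with no learnable parameters at all, so there is nothing for an optimizer to update:

```python
import torch
import torch.nn as nn

# affine=False: no learnable gamma (weight) / beta (bias), only statistics
bn = nn.BatchNorm3d(num_features=64, affine=False)
print(bn.weight, bn.bias)               # None None
print(list(bn.parameters()))            # [] -- nothing for the optimizer

# with affine=True (the default), gamma and beta are registered parameters
bn_affine = nn.BatchNorm3d(num_features=64)
print(len(list(bn_affine.parameters())))  # 2 (weight and bias)
```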
I don't know how Caffe works, but setting the learning rate to 0 is something different in my opinion, since you could still have the gamma and beta terms applied with constant (random) values.
Also, be careful with the momentum argument, since it is different from the one used in optimizer classes and from the conventional notion of momentum. Have a look at this note. You probably want to lower it, i.e. set it to something like momentum=0.001.
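A short sketch of what the note means: in PyTorch, `momentum` is the weight given to the *new* batch statistic, i.e. `running = (1 - momentum) * running + momentum * batch_stat`, whereas Caffe's `moving_average_fraction` is the weight kept on the old running value. So Caffe's 0.999 would correspond to roughly 0.001 here (a 1D layer is used below just to keep the example small):

```python
import torch
import torch.nn as nn

# PyTorch update rule: running = (1 - momentum) * running + momentum * batch_stat
bn = nn.BatchNorm1d(num_features=3, momentum=0.1)
x = torch.randn(100, 3) * 2.0 + 5.0   # batch with mean ~5

bn.train()
bn(x)  # one training step updates the running statistics

batch_mean = x.mean(dim=0)
expected = (1 - 0.1) * torch.zeros(3) + 0.1 * batch_mean
print(torch.allclose(bn.running_mean, expected, atol=1e-5))  # True
```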
Thank you very much!
Caffe uses base_lr to set the base learning rate. The learning rate of a param is base_lr * lr_mult, so parameters in a layer can be set separately in Caffe. But maybe I misunderstood the meaning of some params. I'll check again!
I checked the params again, I was wrong about the meaning of lr_mult: 0.
I noticed that track_running_stats is a new parameter in version 0.4; I didn't see it in previous versions.
The doc says that this module tracks the running mean and variance, and when set to False, this module does not track such statistics and always uses batch statistics in both training and eval modes.
What is the meaning of "tracks the running mean and variance"? Didn't it do that in previous versions?
If track_running_stats is set to True, the vanilla batch norm is used, i.e. the batch statistics are accumulated during training in running_mean and running_var. Setting the model to evaluation mode will use these statistics for all validation samples.
Setting it to False will use the batch statistics of the current batch even in evaluation mode.
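A minimal sketch of that difference: with track_running_stats=False, the layer keeps no running buffers at all, so even in eval() mode the current batch's own statistics are used for normalization:

```python
import torch
import torch.nn as nn

# track_running_stats=False: no running_mean / running_var buffers exist
bn = nn.BatchNorm1d(num_features=3, affine=False, track_running_stats=False)
print(bn.running_mean)   # None -- nothing is tracked

x = torch.randn(8, 3) * 3.0 + 10.0   # batch far from zero mean / unit var
bn.eval()
out = bn(x)              # still normalized with this batch's statistics
print(out.mean(dim=0))   # ~0 even in eval mode
```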