For my current use case, I would like BatchNorm to behave as though it is in inference mode rather than training (just the BatchNorm layers, not the whole network).
I notice the following in the PyTorch documentation:
track_running_stats: a boolean value that when set to ``True``, this
module tracks the running mean and variance, and when set to ``False``,
this module does not track such statistics and always uses batch
statistics in both training and eval modes. Default: ``True``
I was under the impression that at test/eval time BatchNorm would not compute anything on the fly, but would instead use the mean/variance stored during training. However, the track_running_stats description says that when it is set to False, batch statistics are used even in eval mode.
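To make my mental model concrete, here is a small sketch of what I think the two settings do (my own illustration, not from the docs; the layer size and batch shapes are arbitrary):

import torch
import torch.nn as nn

# Default: track_running_stats=True -- running_mean / running_var buffers exist
# and, as I understand it, are what eval mode normalizes with.
bn = nn.BatchNorm2d(2)
bn.train()
_ = bn(torch.randn(8, 2, 4, 4))      # training pass updates the running buffers
bn.eval()
y = bn(torch.randn(8, 2, 4, 4))      # should normalize with the stored buffers

# track_running_stats=False -- no running buffers are kept at all, so per the
# quoted docs the current batch's statistics are used even in eval mode.
bn_nostats = nn.BatchNorm2d(2, track_running_stats=False)
print(bn_nostats.running_mean)       # None, I believe -- nothing stored to fall back on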
When I see this:
http://cs231n.stanford.edu/slides/2019/cs231n_2019_lecture07.pdf
it mentions that at test time BatchNorm can be fused with the previous layer because no separate calculation is performed at test time (quoting the slides: "during testing batchnorm becomes a linear operator! Can be fused with the previous fully-connected or conv layer").
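If I read the slides correctly, the fusion works because in eval mode BatchNorm is just a fixed per-channel affine map, y = gamma * (x - mean) / sqrt(var + eps) + beta, which can be folded into the weight and bias of the preceding conv. A rough sketch of what I understand that to mean (my own code, not an official PyTorch utility; it assumes affine=True, tracked running stats, and ignores groups/dilation):

import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    # Fold an eval-mode BatchNorm2d into the preceding Conv2d.
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)   # gamma / sqrt(var + eps)
    bias = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
    with torch.no_grad():
        fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
        fused.bias.copy_((bias - bn.running_mean) * scale + bn.bias)
    return fused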
So if I do this in PyTorch:

def __init__(self):
    ...
    self.bn = nn.BatchNorm2d(2, track_running_stats=False)

def forward(self, x):
    ...
and after instantiating the net object, I call

net.bn.eval()

Will this still compute the batch mean and variance on the fly when an input batch is passed through, as the documentation suggests? How do I avoid computing statistics in eval mode? In other words, how do I ensure that in eval mode BatchNorm uses only the previously stored statistics and nothing from the current batch?
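For reference, here is a minimal, self-contained version of what I am doing (the Conv2d layer and all sizes are just placeholders for my actual network):

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 2, kernel_size=3, padding=1)   # placeholder layer
        self.bn = nn.BatchNorm2d(2, track_running_stats=False)

    def forward(self, x):
        return self.bn(self.conv(x))

net = Net()                            # the network itself stays in train mode
net.bn.eval()                          # only the BatchNorm layer is switched to eval
out = net(torch.randn(4, 3, 8, 8))     # does this still use the current batch's stats?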