How does PyTorch’s batch norm know whether the forward pass it’s doing is for inference or training?

I see.

As far as I know, the `self.training` flag (toggled by calling `model.train()` or `model.eval()`) only changes the behavior of the Dropout and BatchNorm layers,
so if you don’t use either of these layer types, it has no effect.
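A minimal sketch of how this works in practice: calling `.train()` or `.eval()` on a module sets its `training` attribute, and `BatchNorm` consults that flag on each forward pass to decide between batch statistics and the stored running statistics.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)
x = torch.randn(8, 4)

# Training mode: self.training is True, so BatchNorm normalizes with
# the current batch's mean/var and updates running_mean / running_var.
bn.train()
print(bn.training)  # True
y_train = bn(x)

# Eval mode: self.training is False, so BatchNorm normalizes with the
# stored running_mean / running_var and leaves them unchanged.
bn.eval()
print(bn.training)  # False
y_eval = bn(x)
```

Note that `.train()` and `.eval()` propagate recursively to all submodules, so one call on the top-level model switches every BatchNorm and Dropout layer at once.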