I see.
As far as I know, the self.training flag mainly changes the behavior of layers such as Dropout and Batch Norm,
so if you don't use either of these layer types, it probably has no effect.
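
If you want to check this yourself, here is a minimal sketch, assuming a recent PyTorch install. The layer sizes, `p=0.5`, and the variable names are just illustrative choices, not anything from your model. It compares a plain `nn.Linear` with `nn.Dropout` and `nn.BatchNorm1d` in train and eval mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.ones(4, 8)

# nn.Linear does not look at self.training, so both modes give the same output.
linear = nn.Linear(8, 8)
linear.train()
out_train = linear(x)
linear.eval()
out_eval = linear(x)
linear_same = torch.equal(out_train, out_eval)

# nn.Dropout zeroes entries at random in train mode and rescales the
# survivors by 1 / (1 - p); in eval mode it is the identity.
drop = nn.Dropout(p=0.5)
drop.train()
dropped = drop(x)
drop.eval()
dropout_identity_in_eval = torch.equal(drop(x), x)

# nn.BatchNorm1d updates its running statistics only in train mode;
# in eval mode it normalizes with the stored statistics and leaves them alone.
bn = nn.BatchNorm1d(8)
x_rand = torch.randn(4, 8)
bn.train()
bn(x_rand)
stats_after_train = bn.running_mean.clone()
bn.eval()
bn(x_rand)
bn_stats_frozen_in_eval = torch.equal(stats_after_train, bn.running_mean)

print(linear_same, dropout_identity_in_eval, bn_stats_frozen_in_eval)
```

Custom modules can also branch on `self.training` in their own `forward`, so it is worth checking any layers you wrote yourself.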