I have two networks that share some layers.
For example, F and F1 compose net1, F and F2 compose net2, and net1 and net2 together compose net.
I first trained F and F1, then set requires_grad = False on the weights of F and F1 and started training F2. But every epoch I find that the accuracy of net1 still changes a little; only when I call net.eval() while training F2 does the accuracy stop changing.
However, every epoch when I test the accuracy of net1, I do use net.eval().
So I do not know why this happens.
Might be just me, but I found this kind of hard to follow. Maybe you can make a really simple example, using just one linear layer or something, and random data (torch.rand), so it's easy to reproduce?
Is there batchnorm in either network? Batchnorm updates its running averages when run in training mode, even when you don’t use an optimizer.
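A minimal sketch of that effect; the module below is just a stand-in, not your actual network:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn = nn.BatchNorm1d(4)
for p in bn.parameters():      # freeze the learnable weight and bias
    p.requires_grad = False

x = torch.rand(8, 4)

print(bn.running_mean)         # all zeros right after construction
bn.train()
bn(x)                          # forward pass in training mode, no optimizer step
print(bn.running_mean)         # running mean has moved toward the batch mean

bn.eval()
bn(x)                          # forward pass in eval mode
print(bn.running_mean)         # unchanged this time
```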
OK, I'll try it now and see what happens.
Yes, there are BN layers, but the BN layers' weights also have requires_grad = False, so how can they change?
There’s the learned weight and bias, but also a separate running mean and running variance that it keeps track of outside of any optimizer. If you want these not to update, you need to set the BatchNorm module’s momentum to 0.
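A sketch of that fix, assuming the shared, already-trained part of the model is a plain nn.Module; the helper name freeze_batchnorm_stats and the tiny stand-in network are just for illustration:

```python
import torch
import torch.nn as nn

def freeze_batchnorm_stats(module: nn.Module) -> None:
    """Stop BatchNorm layers under `module` from updating their running stats.

    With momentum = 0 the update
        running = (1 - momentum) * running + momentum * batch_stat
    leaves running_mean and running_var untouched, even in training mode.
    """
    for m in module.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.momentum = 0.0

# Tiny demonstration with a stand-in for the shared, frozen layers.
shared = nn.Sequential(nn.Linear(4, 4), nn.BatchNorm1d(4))
freeze_batchnorm_stats(shared)

shared.train()
before = shared[1].running_mean.clone()
shared(torch.rand(8, 4))
print(torch.equal(before, shared[1].running_mean))  # True: stats did not move
```

An alternative with the same effect is to put only those BatchNorm modules into eval mode while the rest of net stays in training mode.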
Thanks, that is the key.