If my model has dropout, do I have to alternate between model.eval() and model.train() during training?

Hi,

About this question, the answer is yes. The reason is that when you set model.eval(), PyTorch disables all dropout layers (they simply pass the input through unchanged) and batch norm layers stop updating their running mean/variance and use the stored statistics instead. Here is a small snippet to test:

import torch
import torch.nn as nn

class Test(nn.Module):
    def __init__(self):
        super(Test, self).__init__()
        self.layer = nn.Linear(1, 1, bias=False)
        self.dropout = nn.Dropout()  # p=0.5 by default

    def forward(self, x):
        out = self.dropout(self.layer(x))
        return out

model = Test()
x = torch.ones(1, 1)

model.train()
for i in range(10):
    print(model(x))

If you run this code, about half of the time you will get 0 in the output. The reason is that I put a single neuron (self.layer) followed by a dropout with probability 0.5, so it zeroes the output tensor with probability 0.5 (and scales the surviving outputs by 1/(1-p) = 2 otherwise).
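For contrast, here is the same test in eval mode (a minimal sketch reusing the model and x defined above): every call now returns the same deterministic value, because dropout is a no-op in eval mode.

model.eval()
for i in range(10):
    print(model(x))  # always the same value: dropout is inactive in eval mode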

About your other point: this is not the case in your example. The only inference that happens during training is on the validation set, and that is still validation, not training. If you set model.eval() and then get predictions from your model, no dropout is applied and no batch norm statistics are updated, so we could literally remove all of these layers and get the same result. As you know, dropout is a regularization technique that only affects the weight updates during training, so in eval mode it has no effect.
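To make the alternation concrete, here is a minimal sketch of the usual training loop pattern, reusing the Test model from above; the optimizer, loss, and toy data below are hypothetical placeholders just for illustration, not from the original question.

# Hypothetical setup for illustration only.
model = Test()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()
data = [(torch.ones(1, 1), torch.zeros(1, 1))]  # stand-in for a DataLoader

for epoch in range(3):
    model.train()                  # dropout active, batch norm stats updating
    for inputs, targets in data:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()

    model.eval()                   # dropout off, batch norm stats frozen
    with torch.no_grad():          # no gradients needed for validation
        for inputs, targets in data:
            val_loss = criterion(model(inputs), targets)
            print(f"epoch {epoch}: val_loss={val_loss.item():.4f}")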

Bests
Nik
