Model predictions changing with no_grad and .eval()

My model predictions keep changing even though I have set model.eval() and wrapped the forward pass in torch.no_grad(). I am following examples from Natural Language Processing with PyTorch; generate_batches is from the book. demo_model is a class that holds the model (a torch module) along with some other attributes.

demo_model.data.set_split('val')
batch_generator = generate_batches(demo_model.data, batch_size=10,
                                   device=demo_model.device,
                                   shuffle=False,  # demo_model.data._lookup_dict['val'][1]
                                   drop_last=False)
bt = next(batch_generator)
demo_model.model.eval()
with torch.no_grad():    
    y_pred = demo_model.model(bt['demo_feat'].type(torch.FloatTensor))
bt['demo_feat'], bt['y_feat'], y_pred.flatten()  #.sort()

from torch.utils.data import DataLoader

def generate_batches(dataset, batch_size, shuffle=True, drop_last=True, device="cpu"):
    """
    A generator function which wraps the PyTorch DataLoader. It will
    ensure each tensor is on the right device location.
    """
    dataloader = DataLoader(dataset=dataset, batch_size=batch_size,
                            shuffle=shuffle, drop_last=drop_last)

    for data_dict in dataloader:
        out_data_dict = {}
        for name, tensor in data_dict.items():
            out_data_dict[name] = tensor.to(device)
        yield out_data_dict

My model is a simple two-layer model with dropout and ReLU.

import torch.nn as nn
import torch.nn.functional as F

class Demo(nn.Module):

    def __init__(self):
        super(Demo, self).__init__()
        self.agefc1 = nn.Linear(2, 20)
        self.agefc2 = nn.Linear(20, 1)

    def forward(self, demo_in):
        demo_in = self.agefc1(demo_in)
        demo_in = F.relu(F.dropout(demo_in, p=0.5))
        demo_in = self.agefc2(demo_in)
        demo_in = F.dropout(demo_in, p=0.5)

        return demo_in

Example output (two separate runs in JupyterLab):

(tensor([[ 1.1856,  0.0000],
         [ 0.6062,  0.0000],
         [-0.4473,  0.0000],
         [ 1.0276,  1.0000],
         [ 1.5016,  0.0000],
         [ 0.1848,  0.0000],
         [ 0.8695,  0.0000],
         [ 1.9757,  1.0000],
         [-0.9740,  0.0000],
         [-1.3427,  0.0000]], dtype=torch.float64),
 tensor([712.5783,  69.1400, 212.7758,  53.3617, 586.6400,  36.5867,  59.6350,
         482.8508, 450.3008, 163.0558], dtype=torch.float64),
 tensor([43.3376,  0.0000, 16.7106, 48.0151, 50.6728, 43.9258, 28.6625, 10.7091,
          4.8271,  0.0000]))
(tensor([[ 1.1856,  0.0000],
         [ 0.6062,  0.0000],
         [-0.4473,  0.0000],
         [ 1.0276,  1.0000],
         [ 1.5016,  0.0000],
         [ 0.1848,  0.0000],
         [ 0.8695,  0.0000],
         [ 1.9757,  1.0000],
         [-0.9740,  0.0000],
         [-1.3427,  0.0000]], dtype=torch.float64),
 tensor([712.5783,  69.1400, 212.7758,  53.3617, 586.6400,  36.5867,  59.6350,
         482.8508, 450.3008, 163.0558], dtype=torch.float64),
 tensor([37.5727,  0.0000, 19.0388, 57.4453,  0.0000,  0.0000, 35.3358, 46.8472,
         32.5002, 22.6629]))

I’m guessing I’m missing something simple.

As expected, it was very simple: F.dropout does not disable dropout in eval mode. From Stack Overflow:

Both are completely equivalent in terms of applying dropout and even though the differences in usage are not that big, there are some reasons to favour the nn.Dropout over nn.functional.dropout:
Dropout is designed to be only applied during training, so when doing predictions or evaluation of the model you want dropout to be turned off.
The dropout module nn.Dropout conveniently handles this and shuts dropout off as soon as your model enters evaluation mode, while the functional dropout does not care about the evaluation / prediction mode.
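
A quick standalone check of this behaviour (a minimal sketch, not part of the original post) makes the difference obvious: the module version stops zeroing elements once it is in eval mode, the functional call keeps dropping them because its training argument defaults to True.

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.ones(1, 8)

drop = nn.Dropout(p=0.5)
drop.eval()                  # module dropout: disabled in eval mode
print(drop(x))               # identical to x, nothing zeroed

print(F.dropout(x, p=0.5))   # functional dropout: training defaults to True,
                             # so elements are still zeroed and survivors scaled by 2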

Code update:

class Demo(nn.Module):

    def __init__(self):
        super(Demo, self).__init__()
        self.agefc1 = nn.Linear(2, 20)
        self.agefc2 = nn.Linear(20, 1)
        # nn.Dropout is registered as a submodule, so model.eval() turns it off
        self.drop_layer = nn.Dropout(p=0.5)

    def forward(self, demo_in):
        demo_in = self.agefc1(demo_in)
        demo_in = F.relu(self.drop_layer(demo_in))
        demo_in = self.agefc2(demo_in)
        demo_in = self.drop_layer(demo_in)

        return demo_in
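
With nn.Dropout registered as a submodule, repeating the prediction cell under model.eval() and torch.no_grad() now returns identical outputs. A minimal sketch of the check, using random dummy inputs rather than the original data:

import torch

model = Demo()
model.eval()

x = torch.randn(10, 2)
with torch.no_grad():
    first = model(x)
    second = model(x)

print(torch.equal(first, second))  # True: dropout is disabled in eval mode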

The StackOverflow post is unfortunately missing the proper usage of the functional API and sets the training argument of F.dropout to fixed True or False values.
To properly enable/disable F.dropout during training and validation, use out = F.dropout(x, training=self.training), since the self.training attribute is switched by calling model.train() and model.eval(). The nn.Dropout module does the same internally.
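
For completeness, this is roughly what the functional version would look like with the training flag wired up correctly (a sketch assuming the same layer sizes as above; the class name DemoFunctional is just for illustration):

class DemoFunctional(nn.Module):

    def __init__(self):
        super(DemoFunctional, self).__init__()
        self.agefc1 = nn.Linear(2, 20)
        self.agefc2 = nn.Linear(20, 1)

    def forward(self, demo_in):
        demo_in = self.agefc1(demo_in)
        # self.training is toggled by model.train() / model.eval(),
        # so dropout is only applied while training
        demo_in = F.relu(F.dropout(demo_in, p=0.5, training=self.training))
        demo_in = self.agefc2(demo_in)
        demo_in = F.dropout(demo_in, p=0.5, training=self.training)

        return demo_in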