The training output of a simple PyTorch model is different from the evaluation output

I’m confused about why the training output is different from the evaluation output. I printed out the parameters after training and during evaluation to make sure they were the same, so why are the outputs different?

The Python version is 3.6, PyTorch is 1.4, and the CUDA version is 10.0.

The code is below. Are there any mistakes in it? Please point out the problem, thank you very much!

import torch
import torch.nn as nn
import random
import numpy as np


class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        self.linear = nn.Linear(10, 2)

    def forward(self, x):
        output = self.linear(x)

        return output


if __name__ == "__main__":
    random.seed(2020)
    np.random.seed(2020)
    torch.manual_seed(2020)
    torch.cuda.manual_seed_all(2020)
    torch.backends.cudnn.deterministic = True

    model = Classifier().to('cuda')

    optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
    criterion = nn.CrossEntropyLoss()

    optimizer.zero_grad()
    model.zero_grad()

    data = torch.randn((4, 10)).to('cuda')
    label = torch.tensor([1, 0, 1, 0]).long().to('cuda')
    print('data', data)
    output = model(data)
    print('output', output)
    print('*'*20)

    loss = criterion(output, label)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    model.zero_grad()

    for p in model.parameters():
        print('p', p)
    print('*' * 20)

    model.eval()
    with torch.no_grad():
        output = model(data)

    print('data', data)
    print('output', output)
    print('*' * 20)

    for p in model.parameters():
        print('p', p)
    print('*' * 20)

That’s because you are printing an output that was computed before the parameters were updated.
The evaluation output, on the other hand, is computed after the update, so it reflects the new parameters.
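
For example, continuing from your script above (just a sketch, reusing your model, data, label, criterion, and optimizer), recomputing the output right after optimizer.step() should already give the values you later see in eval mode:

    output = model(data)                # forward pass with the initial parameters
    loss = criterion(output, label)
    loss.backward()
    optimizer.step()                    # the parameters change here

    print('old output', output)         # still the values from before the step
    print('new output', model(data))    # forward pass with the updated parameters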

Thank you!

I printed the output after updating the parameters; however, the two outputs are still different.

    loss = criterion(output, label)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    model.zero_grad()

    print('output', output)

    model.eval()
    with torch.no_grad():
        output = model(data)

    print('output', output)

After updating the parameters, you still have to recompute the output with the updated parameters,
i.e., output = model(data). Printing the old output tensor just shows the values from the forward pass you ran before the update.
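
In your second snippet, output still holds the tensor that was computed before optimizer.step(). Something along these lines (same variables as in your code) should print matching values:

    loss = criterion(output, label)
    loss.backward()
    optimizer.step()

    output = model(data)        # recompute with the updated parameters
    print('output', output)

    model.eval()
    with torch.no_grad():
        output = model(data)

    print('output', output)     # matches the print above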


Thank you! Now I understand.