Loaded model is not the same

Hi everyone,

I have a problem with loading my model to continue the training.
I have a minimalistic example:

# train the model for a few epochs (actual training loop omitted)
for epoch in range(num_epochs):
    train_one_epoch(model)  # placeholder for the real training code

# take a single batch, run a forward pass, and save the model
for k, data in enumerate(train_loader):
    # prepare the inputs from the batch
    input1, input2 = data
    output = model(input1.float(), input2.float())
    torch.save(model, 'output/model_saved')  # saves the whole module; saving model.state_dict() is the usual recommendation
    break

model2 = torch.load('output/model_saved')

print('Compare the parameter values')
for key in model2.state_dict().keys():
    print(torch.equal(model.state_dict()[key], model2.state_dict()[key]))

print('Compare outputs')
print(torch.equal(output, model2(input1.float(), input2.float())))

All loaded parameters compare as equal, but the comparison of the outputs returns False.
How can the outputs of model and model2 be different if the model parameters and the inputs are the same?

OK, the problem seems to be known already:
the outputs only match when both models are in eval() mode.
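
For reference, this is the check I mean (a minimal sketch, reusing model, model2 and the inputs from the snippet above):

model.eval()
model2.eval()
with torch.no_grad():
    out1 = model(input1.float(), input2.float())
    out2 = model2(input1.float(), input2.float())
print(torch.equal(out1, out2))  # True once both models are in eval() mode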

Unfortunately, I calculate the loss in model.train() mode.
So my new question would be: is there a way to continue training without having a “kink” in the loss values?

The different outputs in training mode could come from the usage of e.g. dropout layers, or any other layers that change their behavior between training and evaluation.
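
A self-contained sketch of that effect, using a single dropout layer (the layer and tensor here are purely illustrative):

import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1, 100)

drop.train()  # training mode: a fresh random mask is sampled per forward pass
print(torch.equal(drop(x), drop(x)))  # False: the two calls use different masks

drop.eval()   # eval mode: dropout becomes a no-op
print(torch.equal(drop(x), drop(x)))  # True: the outputs are deterministic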

Yes, this is indeed the issue.
One style question:
Is it something like an unspoken rule to calculate the per-epoch loss in model.eval(), in order to get a plot without this behavior?

You would usually use model.eval() to calculate the epoch loss on the validation set, as this would also be the workflow for model deployment. Using the model in training mode during validation and testing would leak information from that data into e.g. the batchnorm running statistics, and could create a biased loss estimate for “real unseen” data. Also, it’s usually not desirable to apply dropout during validation/testing/deployment.
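
For instance, a typical epoch-level validation loop looks roughly like this (val_loader and criterion are assumed to exist in your setup):

model.eval()  # use running batchnorm stats, disable dropout
val_loss = 0.0
with torch.no_grad():  # no gradient tracking needed for validation
    for input1, input2, target in val_loader:
        output = model(input1.float(), input2.float())
        val_loss += criterion(output, target).item()
val_loss /= len(val_loader)  # average loss over the validation batches
model.train()  # switch back to training mode before the next epoch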
