Performance decreases after saving and reloading the model

Hi, I trained the following model:

import torch.nn as nn
import torch

class LambdaBase(nn.Sequential):
    def __init__(self, fn, *args):
        super(LambdaBase, self).__init__(*args)
        self.lambda_func = fn

    def forward_prepare(self, input):
        output = []
        for module in self._modules.values():
            output.append(module(input))
        return output if output else input

class Lambda(LambdaBase):
    def forward(self, input):
        return self.lambda_func(self.forward_prepare(input))

model = nn.Sequential( # Sequential,
	nn.Conv2d(3,64,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Conv2d(64,64,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Dropout(0.25),
	nn.MaxPool2d((4, 4),(4, 4)),
	nn.BatchNorm2d(64,0.001,0.9,True),
	nn.Conv2d(64,128,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Conv2d(128,128,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Dropout(0.25),
	nn.MaxPool2d((4, 4),(4, 4)),
	nn.BatchNorm2d(128,0.001,0.9,True),
	nn.Conv2d(128,256,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Conv2d(256,256,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Dropout(0.25),
	nn.MaxPool2d((4, 4),(4, 4)),
	nn.BatchNorm2d(256,0.001,0.9,True),
	nn.Conv2d(256,128,(1, 1)),
	nn.ReLU(),
	Lambda(lambda x: x.view(x.size(0),-1)), # Reshape,
	nn.Sequential(Lambda(lambda x: x.view(1,-1) if 1==len(x.size()) else x ),nn.Linear(3072,128)), # Linear,
    #nn.Linear(3072,128)
)

Then I load the parameters and train the model like this:

model.load_state_dict(torch.load("pretrained_model.pth"))

for i in range(epochs):
    if i % 100 == 0:
        model.train(False)
        ...validation process...
        if current_score > best_score:
            best_score = current_score  # remember the new best so later epochs must beat it
            torch.save(model.state_dict(), "best_model.pth")

    model.train(True)
    ...training process...
     

But when I reload the saved model best_model.pth, it performs as poorly as the model did before training, even though it reached its best score during training.

Here is how the performance changes:

  • before training: 70% accuracy
  • after training: 87% accuracy => OK, that’s the best score, and I save the model as "best_model.pth"
  • after loading "best_model.pth" (expected to have 87% accuracy): 70% accuracy

Do you know why this happens?

Thanks.


How did you measure the performance?
Have you set the model to .eval() before calculating the performance?

I used model.train(False) instead, then evaluated and saved the model with torch.save(model.state_dict(), "weights.pth").

Is model.eval() needed before saving?

Thanks

No, you should use it for calculating the accuracy of the validation set. Did you set it again after loading the model?

No, you should use it for calculating the accuracy of the validation set.

I got it. I’ll do it this way.

Did you set it again after loading the model?

Could you tell me what “it” means, please?
When I loaded the saved model, I did the following:

model = MyModel()
model.load_state_dict...
model.train(True)
...learning...
model.train(False)
...evaluating...

Sorry for not being clear enough.
I meant setting your model to eval mode.

So let me understand your workflow.
You train your model and the accuracy is good for the training set.
You evaluate this model, the accuracy is still good, and you save it.
After loading the model you train it again.
How is the training accuracy then?
When evaluating again the accuracy is bad.

Did you use an adaptive optimizer (Adam, etc.)?
If so, did you take care of lowering the learning rate?
Did you also save the optimizer and reload it?
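
For readers who do resume training, the question about saving the optimizer can be sketched as follows. This is a minimal sketch, not the poster's code: the toy nn.Linear model, the "checkpoint.pth" file name, and the "model"/"optimizer" dictionary keys are all assumptions made for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # hypothetical stand-in for the real network
optimizer = torch.optim.Adadelta(model.parameters())

# Take one step so the optimizer actually holds per-parameter state.
loss = model(torch.randn(3, 4)).sum()
loss.backward()
optimizer.step()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "checkpoint.pth")  # hypothetical file name
    # Store both state dicts in one file so they stay in sync.
    torch.save(
        {"model": model.state_dict(), "optimizer": optimizer.state_dict()},
        path,
    )
    checkpoint = torch.load(path)

# Rebuild the objects first, then load the saved state into them.
restored_model = nn.Linear(4, 2)
restored_optimizer = torch.optim.Adadelta(restored_model.parameters())
restored_model.load_state_dict(checkpoint["model"])
restored_optimizer.load_state_dict(checkpoint["optimizer"])
```

If the model is only used for inference or feature extraction, the optimizer state is not needed and saving model.state_dict() alone is enough.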


Here are my answers to your questions:


You train your model and the accuracy is good for the training set. => Yes
You evaluate this model, the accuracy is still good, and you save it. => Yes(achieved 87% accuracy)
After loading the model you train it again. => No, I only loaded the model
When evaluating again the accuracy is bad. => Yes(70% accuracy)


Did you use an adaptive optimizer (Adam, etc.)?
If so, did you take care of lowering the learning rate?
Did you also save the optimizer and reload it?

I used Adadelta for optimization, but I wanted to use the saved model as a feature extractor, so I didn’t retrain the model.


Once you loaded the model, did you make sure to set it to eval mode?
e.g. model.eval() or model.train(False)
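
The mode matters here because the model in this thread contains Dropout and BatchNorm layers, which behave differently in training and evaluation. The sketch below uses a small toy network (an assumption, not the poster's model) to show that model.train(False) and model.eval() set the same flag, and that outputs are repeatable only in eval mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in containing the two layer types that depend on the mode flag.
net = nn.Sequential(nn.Linear(8, 8), nn.BatchNorm1d(8), nn.Dropout(0.25))
x = torch.randn(4, 8)

net.train(False)  # has the same effect as net.eval()
modes_after_train_false = [m.training for m in net.modules()]
net.eval()
modes_after_eval = [m.training for m in net.modules()]

# In eval mode dropout is disabled and BatchNorm uses its running statistics,
# so two forward passes on the same input give identical outputs.
with torch.no_grad():
    eval_is_repeatable = torch.equal(net(x), net(x))

# In train mode dropout samples a fresh random mask on every call.
net.train()
with torch.no_grad():
    train_is_repeatable = torch.equal(net(x), net(x))

print(eval_is_repeatable, train_is_repeatable)
```

Forgetting to switch back to eval mode after load_state_dict is a common cause of a score gap, although in this thread the poster reports having done so.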


Hi, thanks for dealing with it.
Yes, I set model.eval() before computing accuracy.

I met the same problem on PyTorch v1.0: I got an AUC of 0.73 on the validation set, but after loading the model I only got an AUC of 0.51 on the same data. I also called model.eval() before calculating the AUC metric. I still don’t know why.

Could you post a (small) executable code snippet so that we could have a look at this issue?
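
One possible shape for such a snippet is a save/reload round trip checked on a fixed batch. This is only a sketch under assumptions: build_model is a hypothetical toy architecture and "best_model.pth" is written to a temporary directory. If the outputs match here, the weights round-trip correctly and the discrepancy more likely comes from the data pipeline or preprocessing.

```python
import os
import tempfile

import torch
import torch.nn as nn


def build_model():
    # Hypothetical toy architecture; replace with the real model definition.
    return nn.Sequential(
        nn.Conv2d(3, 8, 3, padding=1),
        nn.BatchNorm2d(8),
        nn.ReLU(),
        nn.Dropout(0.25),
        nn.Flatten(),
        nn.Linear(8 * 4 * 4, 5),
    )


torch.manual_seed(0)
trained = build_model()

# Run one batch in train mode so the BatchNorm running statistics move away
# from their initial values, as they would during real training.
trained.train()
with torch.no_grad():
    trained(torch.randn(16, 3, 4, 4))

fixed_batch = torch.randn(2, 3, 4, 4)
trained.eval()
with torch.no_grad():
    expected = trained(fixed_batch)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "best_model.pth")
    torch.save(trained.state_dict(), path)
    state = torch.load(path)

# Load into a freshly constructed model and compare in eval mode.
reloaded = build_model()
reloaded.load_state_dict(state)
reloaded.eval()
with torch.no_grad():
    actual = reloaded(fixed_batch)

outputs_match = torch.allclose(expected, actual)
print(outputs_match)
```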

Sorry, after I retrained the model and restored it, the bug no longer appeared. If it appears again, I will post the code.

ptrblck via PyTorch Forums noreply@discuss.pytorch.org wrote on Wed, May 29, 2019 at 6:51 PM:

Hi ptrblck, I found the reason why my model got worse after loading. I didn’t save the word2idx dict, so when I loaded the model, the word2idx dict was not consistent with the one used when the model was trained. The problem was solved once I also saved word2idx. Thanks, it was my fault.
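
A minimal sketch of that fix, assuming the vocabulary is stored in the same file as the weights. The word2idx contents, the "model"/"word2idx" keys, the encode helper, and the file name are hypothetical and not taken from the poster's code.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical vocabulary built at training time.
word2idx = {"<unk>": 0, "the": 1, "model": 2, "works": 3}
embedding = nn.Embedding(len(word2idx), 8)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_with_vocab.pth")  # hypothetical name
    # Keep the vocabulary next to the weights so both are restored together.
    torch.save({"model": embedding.state_dict(), "word2idx": word2idx}, path)
    checkpoint = torch.load(path)

restored_word2idx = checkpoint["word2idx"]
restored = nn.Embedding(len(restored_word2idx), 8)
restored.load_state_dict(checkpoint["model"])


def encode(tokens, vocab):
    # Map tokens to the indices the embedding rows were trained with.
    return torch.tensor([vocab.get(token, vocab["<unk>"]) for token in tokens])


ids = encode(["the", "model", "works"], restored_word2idx)
vectors = restored(ids)
```

Rebuilding the vocabulary from scratch at load time can assign different indices to the same words, so the embedding rows no longer line up with their tokens even though the weights loaded correctly.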

Great that it’s working, and thanks for getting back! :)

Hi ptrblck, could you help me take a look at the topic “Deploy pytorch model on spark”? Thanks very much.

No, sorry, unfortunately I’m inexperienced in Spark.

Thanks all the same.

I know this is an old post, but it turns out I’m experiencing the same issue now. It is not clear to me what you mean by the word2idx dict. Is this some kind of embedding in your model?

It’s a dict that maps each word to its index so it can be fed into the embedding layer, like {'trump': 0, 'is': 1, 'shit': 2}.

Alejandro Rodriguez Perez via PyTorch Forums noreply@discuss.pytorch.org wrote on Thu, Mar 19, 2020 at 2:42 AM:

Thanks for your response. I actually figured out the reason for the non-deterministic behaviour. It was the dataset: some indices I was creating to feed a char-embedding layer were being constructed non-deterministically.
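
The post does not say how those indices were built, so the following is only a guess at one common cause: enumerating a Python set of strings, whose iteration order can change between interpreter runs because of string hash randomization. A minimal sketch of a deterministic alternative, using a hypothetical build_char2idx helper:

```python
def build_char2idx(texts):
    """Assign each character a stable index, independent of input order."""
    chars = set()
    for text in texts:
        chars.update(text)
    # sorted() fixes the order; enumerating the raw set directly could give
    # different indices in different Python processes.
    return {char: index for index, char in enumerate(sorted(chars))}


char2idx = build_char2idx(["hello", "world"])
same_when_reordered = build_char2idx(["world", "hello"]) == char2idx
print(char2idx)
```

Saving the resulting mapping together with the model, rather than rebuilding it at load time, avoids the problem altogether.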