Performance decreases after saving and reloading the model

kenmikanmi · February 20, 2018, 6:45am

Hi, I trained the following model:

import torch.nn as nn
import torch

class LambdaBase(nn.Sequential):
    def __init__(self, fn, *args):
        super(LambdaBase, self).__init__(*args)
        self.lambda_func = fn

    def forward_prepare(self, input):
        output = []
        for module in self._modules.values():
            output.append(module(input))
        return output if output else input

class Lambda(LambdaBase):
    def forward(self, input):
        return self.lambda_func(self.forward_prepare(input))

model = nn.Sequential( # Sequential,
	nn.Conv2d(3,64,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Conv2d(64,64,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Dropout(0.25),
	nn.MaxPool2d((4, 4),(4, 4)),
	nn.BatchNorm2d(64,0.001,0.9,True),
	nn.Conv2d(64,128,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Conv2d(128,128,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Dropout(0.25),
	nn.MaxPool2d((4, 4),(4, 4)),
	nn.BatchNorm2d(128,0.001,0.9,True),
	nn.Conv2d(128,256,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Conv2d(256,256,(3, 3),(1, 1),(1, 1)),
	nn.ReLU(),
	nn.Dropout(0.25),
	nn.MaxPool2d((4, 4),(4, 4)),
	nn.BatchNorm2d(256,0.001,0.9,True),
	nn.Conv2d(256,128,(1, 1)),
	nn.ReLU(),
	Lambda(lambda x: x.view(x.size(0),-1)), # Reshape,
	nn.Sequential(Lambda(lambda x: x.view(1,-1) if 1==len(x.size()) else x ),nn.Linear(3072,128)), # Linear,
    #nn.Linear(3072,128)
)

Then I load the parameters and train the model like this:

model.load_state_dict(torch.load("pretrained_model.pth"))

for i in epochs:
    if(i % 100 == 0):
        model.train(False)
        ...validation process...
        if(current_score > best_score):
            torch.save(model.state_dict(), "best_model.pth")

    model.train(True)
    ...training process...

But when I reload the saved model best_model.pth, its performance is as low as it was before training, even though it showed the best performance during training.

Here is how the performance changes:

before training・・・70% accuracy
after training・・・87% accuracy => OK, that’s the best score, and I save the model as "best_model.pth"
after loading "best_model.pth" (expected to have 87% accuracy)・・・70% accuracy

Do you know why this happens?

Thanks.

ptrblck · February 22, 2018, 12:19pm

How did you measure the performance?
Have you set the model to .eval() before calculating the performance?

kenmikanmi · February 22, 2018, 9:08pm

I used model.train(False) instead, then evaluated and saved the model with torch.save(model.state_dict(), "weights.pth").

Is model.eval() needed before saving?

Thanks

ptrblck · February 22, 2018, 9:09pm

No, you should use it for calculating the accuracy of the validation set. Did you set it again after loading the model?
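
A minimal sketch of why the mode matters, assuming a small stand-in network (make_model is a hypothetical helper, not the model from the question). Dropout and BatchNorm behave differently in training and evaluation mode, and a state_dict round trip should reproduce the evaluation outputs once the reloaded model is also put into evaluation mode:

```python
import io

import torch
import torch.nn as nn


def make_model():
    # Hypothetical small stand-in for the network in the question.
    return nn.Sequential(
        nn.Conv2d(3, 8, 3, padding=1),
        nn.ReLU(),
        nn.Dropout(0.25),
        nn.BatchNorm2d(8),
    )


torch.manual_seed(0)
model = make_model()
x = torch.randn(4, 3, 16, 16)

# One forward pass in training mode updates the BatchNorm running statistics.
model.train()
model(x)

# Training mode: Dropout draws a new random mask on every call.
with torch.no_grad():
    train_outputs_match = torch.allclose(model(x), model(x))

# Evaluation mode: Dropout is disabled and BatchNorm uses its running statistics.
model.eval()
with torch.no_grad():
    eval_out = model(x)
    eval_outputs_match = torch.allclose(eval_out, model(x))

# Save the state_dict to an in-memory buffer (a file path works the same way).
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

reloaded = make_model()
reloaded.load_state_dict(torch.load(buffer))
reloaded.eval()  # a freshly constructed module starts in training mode
with torch.no_grad():
    roundtrip_matches = torch.allclose(eval_out, reloaded(x))

print(train_outputs_match, eval_outputs_match, roundtrip_matches)
```

If the reloaded model is evaluated without calling eval(), its outputs would generally differ from eval_out, which is one common way to see a drop in accuracy after loading.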

kenmikanmi · February 22, 2018, 9:25pm

"No, you should use it for calculating the accuracy of the validation set."

I got it. I’ll do it this way.

"Did you set it again after loading the model?"

Could you please tell me what “it” means?
When I loaded the saved model, I did the following:

model = MyModel()
model.load_state_dict...
model.train(True)
...learning...
model.train(False)
...evaluating...

ptrblck · February 22, 2018, 11:54pm

Sorry for not being clear enough.
I meant setting your model to eval.

So let me understand your workflow.
You train your model and the accuracy is good for the training set.
You evaluate this model, the accuracy is still good, and you save it.
After loading the model you train it again.
How is the training accuracy then?
When evaluating again the accuracy is bad.

Did you use an adaptive optimizer (Adam, etc.)?
If so, did you take care of lowering the learning rate?
Did you also save the optimizer and reload it?
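
For the case where training is resumed after loading, a minimal sketch of saving and restoring the optimizer together with the model. The tiny nn.Linear model and the checkpoint keys "model" and "optimizer" are assumptions chosen for illustration; Adadelta is used only because it is the optimizer mentioned in this thread:

```python
import io

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)  # hypothetical stand-in model
optimizer = torch.optim.Adadelta(model.parameters())

# Take one training step so the optimizer accumulates internal state.
loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()

# Store both state_dicts in one checkpoint (the key names are arbitrary).
buffer = io.BytesIO()  # stands in for a file path such as "checkpoint.pth"
torch.save(
    {"model": model.state_dict(), "optimizer": optimizer.state_dict()},
    buffer,
)
buffer.seek(0)

# Rebuild the objects first, then load the saved state into them.
checkpoint = torch.load(buffer)
restored_model = nn.Linear(4, 2)
restored_optimizer = torch.optim.Adadelta(restored_model.parameters())
restored_model.load_state_dict(checkpoint["model"])
restored_optimizer.load_state_dict(checkpoint["optimizer"])
```

Restoring only the model weights and creating a fresh optimizer discards the accumulated statistics, which can change how the first steps after resuming behave.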

kenmikanmi · February 23, 2018, 10:50am

Here are my answers to your questions:

You train your model and the accuracy is good for the training set. => Yes
You evaluate this model, the accuracy is still good, and you save it. => Yes(achieved 87% accuracy)
After loading the model you train it again. => No, I only loaded the model
When evaluating again the accuracy is bad. => Yes(70% accuracy)

Did you use an adaptive optimizer (Adam, etc.)?
If so, did you take care of lowering the learning rate?
Did you also save the optimizer and reloaded it?

I used Adadelta for optimization, but I want to use the saved model as a feature extractor, so I don’t retrain it.

Diego · February 24, 2018, 6:26pm

Once you loaded the model did you make sure to set it in eval mode?
e.g: model.eval() or model.train(False)
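
A short sketch of the two calls mentioned above, assuming a hypothetical two-layer model. It checks that a freshly constructed module starts in training mode and that model.eval() and model.train(False) leave every submodule with the same training flag:

```python
import torch.nn as nn

# Hypothetical model; any nn.Module behaves the same way here.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(0.5))

# A newly constructed module (for example, right before load_state_dict)
# is in training mode by default.
starts_in_training_mode = model.training

model.train(False)
flags_after_train_false = [m.training for m in model.modules()]

model.train(True)
model.eval()
flags_after_eval = [m.training for m in model.modules()]
```

Because load_state_dict only copies parameters and buffers, the mode has to be set explicitly after loading.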

kenmikanmi · February 24, 2018, 7:05pm

Hi, thanks for looking into it.
Yes, I set model.eval() before computing accuracy.

0bff83efac608c536648 · May 29, 2019, 2:40am

I ran into the same problem on PyTorch v1.0. I got an AUC of 0.73 on the validation set, but when I load the model, I only get an AUC of 0.51 on the same data. I also called model.eval() before calculating the AUC metric. I still don’t know why.

ptrblck · May 29, 2019, 10:41am

Could you post a (small) executable code snippet so that we could have a look at this issue?

0bff83efac608c536648 · May 30, 2019, 6:36am

Sorry, after I retrained the model and restored it, the bug no longer appeared. If it appears again, I will post the code.

0bff83efac608c536648 · June 12, 2019, 1:55am

Hi ptrblck, I found the reason why my model got worse after loading. I hadn’t saved the word2idx dict, so when I loaded the model, the word2idx dict was not consistent with the one used when the model was trained. The problem is solved now that I also save word2idx. Thanks, it was my fault.
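
A minimal sketch of the fix described above, assuming a hypothetical three-word vocabulary and a bare nn.Embedding as the model. The checkpoint keys "state_dict" and "word2idx" and the encode helper are illustrative names, not part of any PyTorch API:

```python
import io

import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical vocabulary built at training time.
word2idx = {"<pad>": 0, "hello": 1, "world": 2}
embedding = nn.Embedding(len(word2idx), 4)

# Save the vocabulary in the same checkpoint as the weights.
buffer = io.BytesIO()  # stands in for a file path
torch.save({"state_dict": embedding.state_dict(), "word2idx": word2idx}, buffer)
buffer.seek(0)

# At load time, restore the vocabulary instead of rebuilding it from data.
checkpoint = torch.load(buffer)
restored_word2idx = checkpoint["word2idx"]
restored = nn.Embedding(len(restored_word2idx), 4)
restored.load_state_dict(checkpoint["state_dict"])


def encode(sentence, vocab):
    # Map whitespace-separated tokens to the indices used during training.
    return torch.tensor([vocab[word] for word in sentence.split()])


ids = encode("hello world", restored_word2idx)
same_vectors = torch.equal(embedding(ids), restored(ids))
```

If the vocabulary were rebuilt with a different word order, the same weights would be looked up under the wrong indices, and the model would see scrambled inputs even though the state_dict loaded without errors.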

ptrblck · June 12, 2019, 9:50am

Great it’s working and thanks for getting back!

0bff83efac608c536648 · July 8, 2019, 2:51am

Hi ptrblck, could you help me by taking a look at the topic "Deploy pytorch model on spark"? Thanks very much.

ptrblck · July 8, 2019, 10:40am

No, sorry, unfortunately I’m inexperienced in Spark.

0bff83efac608c536648 · July 8, 2019, 10:43am

thanks all the same~

arod40 · March 18, 2020, 6:31pm

I know this is an old post, but it turns out I’m experiencing the same issue now. It is not clear to me what you mean by the word2idx dict. Is this some kind of embedding in your model?

0bff83efac608c536648 · March 18, 2020, 10:29pm

It’s a dict that maps each word to its index, used to feed the embedding layer, like {'trump': 0, 'is': 1, 'shit': 2}.

arod40 · March 19, 2020, 3:06pm

Thanks for your response. I actually figured out the reason for the non-deterministic behaviour: it was the dataset. Some indices I was creating to feed a char-embedding layer were being constructed non-deterministically.
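
The post above does not say how the indices were built, so the following is only a sketch of one common cause and fix, under that assumption. Iterating over a Python set of strings can yield a different order in different runs, so an index built that way may not match the one used in training. Sorting before enumerating gives a reproducible mapping (build_char2idx is a hypothetical helper):

```python
def build_char2idx(texts):
    """Map each character to an index in a reproducible (sorted) order."""
    # The set removes duplicates; sorted() fixes the order across runs.
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i for i, ch in enumerate(chars)}


char2idx = build_char2idx(["cab", "abc"])
print(char2idx)
```

Saving the resulting mapping next to the model weights, rather than rebuilding it at load time, avoids the problem even if the construction is later changed.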