I am having the same issue with PyTorch 0.4.1. If I save the model the way it is explained in the ImageNet training example in the PyTorch GitHub repository and load it afterwards, it yields different results than it did before saving.
I am using a training loop like this:
scheduler = StepLR(optimizer, step_size=config.lr_step_size, gamma=config.lr_gammma)
for epoch in range(epochs):
    train(model)  # train() calls model.train() first
    test(model)   # test() calls model.eval() first
    # Then I save the model checkpoint (optimizer, epoch, best_acc, model)
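To make the save step concrete, here is a minimal sketch of such a checkpoint. The `save_checkpoint` helper, the dictionary keys and the small `nn.Linear` stand-in model are hypothetical and only for illustration, and it assumes a recent PyTorch version rather than 0.4.1. It writes to an in-memory buffer so it runs without touching the disk; a file path works the same way.

```python
import io

import torch
import torch.nn as nn
import torch.optim as optim


def save_checkpoint(target, model, optimizer, epoch, best_acc):
    """Store everything listed in the loop comment (hypothetical helper)."""
    torch.save(
        {
            "epoch": epoch,
            "best_acc": best_acc,
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
        },
        target,
    )


# Stand-in model and optimizer, not the real network from the post.
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

buffer = io.BytesIO()
save_checkpoint(buffer, model, optimizer, epoch=3, best_acc=0.5)

# Read the checkpoint back to confirm what was stored.
buffer.seek(0)
checkpoint = torch.load(buffer)
```

Note that the scheduler is not part of this dictionary, so its state would not survive a restart with a checkpoint like this one.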
After it is saved, I want to load the model. First I create the exact same model “structure” and optimizer, and after that I call load_state_dict() on each of them. Is there a difference between saving the model in .eval() mode and in .train() mode? IMHO this shouldn’t be a problem, right?
Or should I set the model to .eval() before going into the train loop after resuming?
Could the lr_scheduler be the culprit?
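One way to rule the scheduler in or out might be to save and restore its state together with the model and optimizer, then compare the learning rate after resuming. The sketch below assumes a recent PyTorch version where schedulers expose `state_dict()` and `load_state_dict()`. The `build` helper, the `nn.Linear` stand-in model and the hyperparameters are hypothetical, not my real setup.

```python
import io

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR


def build():
    """Recreate the same model/optimizer/scheduler structure (stand-ins)."""
    model = nn.Linear(4, 2)
    optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    scheduler = StepLR(optimizer, step_size=2, gamma=0.1)
    return model, optimizer, scheduler


model, optimizer, scheduler = build()
for epoch in range(3):
    # One dummy training step per "epoch", then advance the scheduler.
    optimizer.zero_grad()
    model(torch.randn(8, 4)).sum().backward()
    optimizer.step()
    scheduler.step()

# Save all three state dicts (in memory here; a file path works the same way).
buffer = io.BytesIO()
torch.save(
    {
        "model": model.state_dict(),
        "optimizer": optimizer.state_dict(),
        "scheduler": scheduler.state_dict(),
    },
    buffer,
)
buffer.seek(0)
checkpoint = torch.load(buffer)

# Resume: rebuild the structure first, then load each state dict.
new_model, new_optimizer, new_scheduler = build()
new_model.load_state_dict(checkpoint["model"])
new_optimizer.load_state_dict(checkpoint["optimizer"])
new_scheduler.load_state_dict(checkpoint["scheduler"])

# train/eval mode is a flag on the module and is not stored in state_dict,
# so it has to be set explicitly after loading.
new_model.train()

resumed_lr = new_optimizer.param_groups[0]["lr"]
```

If `resumed_lr` and `new_scheduler.last_epoch` match the values from before saving, the schedule itself resumed where it left off.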
Has anyone else had these issues? How did you fix them?
Thank you very much!