That’s a weird issue, as both approaches should be equivalent if you are loading the state_dict from a file.
Just to make sure there is no error in the first approach, since you mentioned “similar to”:
you should either load the state_dict from a file or create a copy.deepcopy of the state_dict.
Otherwise the state_dict will hold references to the parameters, and you will just load the same (current) parameters back into the model.
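To illustrate the reference issue with a toy module (the `nn.Linear` here is just for demonstration):

```python
import copy

import torch
import torch.nn as nn

model = nn.Linear(2, 2)

ref_sd = model.state_dict()        # tensors share storage with the live parameters
saved_sd = copy.deepcopy(ref_sd)   # independent snapshot of the current values

# Simulate further training by modifying the parameters in place
with torch.no_grad():
    model.weight.add_(1.0)

# The plain state_dict followed the model, so "restoring" it would be a no-op...
print(torch.equal(ref_sd['weight'], model.weight))    # True
# ...while the deepcopy still holds the old values
print(torch.equal(saved_sd['weight'], model.weight))  # False
```

So if you keep the state_dict in memory instead of saving it to a file, `copy.deepcopy` is what actually freezes the values at that point in time.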
Interesting. I started digging into backing up the optimizer as well, because I wasn’t doing that before, but even then it’s still not working.
If I asked you to revert to how a model was at epoch x, how would you do it?
I was even thinking about emptying the GPU and re-adding the model with the state_dict, because it works perfectly when I restart my notebook, but not when I load the state_dict while I’m training.
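For reference, this is roughly the save/restore pattern I mean by backing up both the model and the optimizer (the filename and epoch number are just placeholders):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# At epoch x: save everything needed to come back to this point
w_at_save = model.weight.detach().clone()
checkpoint = {
    'epoch': 40,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
}
torch.save(checkpoint, 'checkpoint_epoch_40.pt')

# Simulate training drifting the weights away from the checkpoint
with torch.no_grad():
    model.weight.add_(1.0)

# To revert: load the checkpoint back into the same objects
checkpoint = torch.load('checkpoint_epoch_40.pt')
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch'] + 1
```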
Even if I do something like:
import torch
from torchvision import models

if epoch == 50:
    # Rebuild the model and restore the epoch-40 weights
    new_model = models.resnet18(pretrained=True)
    new_model.load_state_dict(torch.load('weights_from_epoch_40'))
    new_model.to(DEVICE)
Then epoch 51 will have the same accuracy as if I had never loaded the weights from epoch 40.