That’s a weird issue, as both approaches should be equivalent if you are loading the state_dict from a file.
Just to make sure there is no error in the first approach, since you mentioned “similar to”:
you should either load the state_dict from a file or create a copy.deepcopy of the state_dict.
Otherwise the state_dict will hold references to the parameters, and you will just load the same (current) parameters back into the model.
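To illustrate the reference issue with a toy module (the `nn.Linear` here is just for demonstration):

```python
import copy

import torch
import torch.nn as nn

model = nn.Linear(2, 2)

ref_sd = model.state_dict()        # tensors share storage with the live parameters
saved_sd = copy.deepcopy(ref_sd)   # independent snapshot of the current values

# Simulate further training by modifying the parameters in place
with torch.no_grad():
    model.weight.add_(1.0)

# The plain state_dict followed the model, so "restoring" it would be a no-op...
print(torch.equal(ref_sd['weight'], model.weight))    # True
# ...while the deepcopy still holds the old values
print(torch.equal(saved_sd['weight'], model.weight))  # False
```

So if you keep the state_dict in memory instead of saving it to a file, `copy.deepcopy` is what actually freezes the values at that point in time.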
Interesting. I started digging into backing up the optimizer as well, because I wasn’t doing that before, but even then it’s still not working.
If I asked you to revert to how a model was at epoch x, how would you do it?
I was even thinking about emptying the GPU and re-adding the model with the state_dict, because it works perfectly when I restart my notebook, but not when I load the state_dict while I’m training.
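For reference, this is roughly the save/restore pattern I mean by backing up both the model and the optimizer (the filename and epoch number are just placeholders):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# At epoch x: save everything needed to come back to this point
w_at_save = model.weight.detach().clone()
checkpoint = {
    'epoch': 40,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
}
torch.save(checkpoint, 'checkpoint_epoch_40.pt')

# Simulate training drifting the weights away from the checkpoint
with torch.no_grad():
    model.weight.add_(1.0)

# To revert: load the checkpoint back into the same objects
checkpoint = torch.load('checkpoint_epoch_40.pt')
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch'] + 1
```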
Even if I do something like:
import torch
from torchvision import models

if epoch == 50:
    # Rebuild the model and restore the epoch-40 weights
    new_model = models.resnet18(pretrained=True)
    new_model.load_state_dict(torch.load('weights_from_epoch_40'))
    new_model.to(DEVICE)
Then epoch 51 will have the same accuracy as if I had never loaded the weights from epoch 40.