state_dict() changes with the model

Hi all, I want to save the best model during training and then load it for testing. So I use the following method:

```python
def train():
    # training steps ...
    if acc > best_acc:
        best_acc = acc
        best_state = model.state_dict()
    return best_state
```

Then, in the main function, I use `model.load_state_dict(best_state)` to restore the model.
However, I found that best_state is always the same as the last state during training, not the best state. Does anyone know the reason and how to avoid it? (Version: 1.1.0, Linux, GPU)

By the way, I know I can use `torch.save(the_model.state_dict(), PATH)` and then load the model with `the_model.load_state_dict(torch.load(PATH))`. However, I don’t want to save the parameters to a file, since the train and test functions are in the same file.

I don’t remember exactly right now, but the returned state_dict may map to the same memory as the model’s parameters. Try making a deep copy of the state_dict when you save it.
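A quick way to check this, as a minimal sketch (the throwaway nn.Linear is just for illustration):

```python
import copy
import torch
import torch.nn as nn

model = nn.Linear(2, 2)
shallow = model.state_dict()                  # holds references to the live parameters
snapshot = copy.deepcopy(model.state_dict())  # independent copy of the weights

with torch.no_grad():
    model.weight.add_(1.0)  # stand-in for a training update

print(torch.equal(shallow["weight"], model.weight))   # True  -> moved with the model
print(torch.equal(snapshot["weight"], model.weight))  # False -> frozen at save time
```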

Hey there! Welcome to the community!

```python
if acc > best_acc:
    best_acc = acc  # you missed this
    best_state = model.state_dict()
    ...
```

Happy Coding :man_technologist:

Thanks for your answer. I have updated the code in the question; I do have `best_acc = acc` in my code, and it still doesn’t work.

Please paste the code properly inside ``` fences.

P.S. I see that the `return best_state` is also there. Are you sure you are returning after several epochs, and not just after one epoch, or whenever `acc > best_acc`? (It can’t be figured out, as there is no indentation in your question.)

Make a deep copy of the state_dict:

```python
import copy

best_model = copy.deepcopy(model.state_dict())
```
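Applied to the train() from the question, a self-contained sketch could look like this (the `acc = torch.rand(1).item()` line just stands in for a real evaluation):

```python
import copy
import torch
import torch.nn as nn

def train(model, n_epochs=5):
    best_acc = 0.0
    best_state = None
    for epoch in range(n_epochs):
        # ... optimizer step would go here; we fake an accuracy for the sketch ...
        acc = torch.rand(1).item()
        if acc > best_acc:
            best_acc = acc
            # Snapshot the weights; without deepcopy, best_state would keep
            # tracking the live parameters as training continues.
            best_state = copy.deepcopy(model.state_dict())
    return best_state

model = nn.Linear(2, 2)
best_state = train(model)
model.load_state_dict(best_state)  # restore the best weights for testing
```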


Yeah! Since you are not saving it to a file, @god_sp33d and @JuanFMontesinos are right that you need to make a deep copy. deepcopy should work, as long as there are no leaks elsewhere in the code.


Thanks, deepcopy works


I had to use deepcopy; otherwise the saved state keeps changing with training, since state_dict() returns references to the live parameters rather than copies.
