What does `optimizer.state_dict()` store?

I found out about `optimizer.state_dict()`, but I don't understand what this data is. While using Adam, do we really need to store anything extra in memory? It just computes rolling averages.
If I want to continue training later, should I also save the optimizer's state_dict?

An optimizer's state dictionary contains two entries: `state`, the per-parameter optimizer state (for Adam, the step count and the running averages `exp_avg` and `exp_avg_sq`), and `param_groups`, the hyperparameters in use (learning rate, betas, weight decay, etc.). So Adam's rolling averages are exactly what gets stored, and they live in the state dict.
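A quick way to see this for yourself (a minimal sketch with a small `nn.Linear`, but any model behaves the same way):

```python
import torch

model = torch.nn.Linear(4, 2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Run one step so Adam populates its per-parameter state.
loss = model(torch.randn(8, 4)).sum()
loss.backward()
opt.step()

sd = opt.state_dict()
print(sd.keys())               # dict_keys(['state', 'param_groups'])
print(sd["state"][0].keys())   # includes 'step', 'exp_avg', 'exp_avg_sq'
print(sd["param_groups"][0]["lr"])
```

Before the first `opt.step()`, `sd["state"]` is empty; the running averages are created lazily on the first update.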

To resume training correctly, you should indeed save (and later re-load) the state_dict of your model AND the Adam optimizer. If you restore only the model, Adam restarts with zeroed averages and step counts, which can cause a loss spike right after resuming.
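A hedged sketch of the save/resume pattern (the checkpoint filename `ckpt.pt` is just illustrative):

```python
import torch

model = torch.nn.Linear(4, 2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Train a step so there is real optimizer state to checkpoint.
loss = model(torch.randn(8, 4)).sum()
loss.backward()
opt.step()

# Save both state dicts in one checkpoint file.
torch.save({"model": model.state_dict(), "optimizer": opt.state_dict()}, "ckpt.pt")

# Later: rebuild the same objects, then restore BOTH state dicts before resuming.
model2 = torch.nn.Linear(4, 2)
opt2 = torch.optim.Adam(model2.parameters(), lr=1e-3)
ckpt = torch.load("ckpt.pt")
model2.load_state_dict(ckpt["model"])
opt2.load_state_dict(ckpt["optimizer"])
```

After `load_state_dict`, the restored optimizer carries the same running averages and step counts as the original, so training continues as if it were never interrupted.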

Hope this helps. Here are a few links that might assist:

  1. torch.optim.Optimizer.state_dict — PyTorch 1.10 documentation
  2. deep learning - What are saved in optimizer's state_dict? what "state","param_groups" stands for? - Stack Overflow