Hello
I found optimizer.state_dict, but I don't understand what this data actually is. I assumed that with Adam we don't need to keep anything extra in memory, since it just computes rolling averages.
If I want to continue training later, should I also save the optimizer's state_dict?
An optimizer's state_dict contains two entries: `state`, the per-parameter buffers (for Adam, the step counter and the running averages of the gradients and squared gradients), and `param_groups`, the hyperparameters in use, such as the learning rate. So Adam does keep extra data in memory beyond the model parameters.
To resume training correctly, you should indeed save (and re-load) the state_dict of your model AND of the Adam optimizer.
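A minimal sketch of what that looks like in practice; the model, the dummy training step, and the filename `"checkpoint.pt"` are illustrative, not from your setup:

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One dummy step so Adam populates its per-parameter state
# (step counter, exp_avg, exp_avg_sq).
loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()

# The optimizer's state_dict has exactly two entries.
sd = optimizer.state_dict()
print(sorted(sd.keys()))  # ['param_groups', 'state']

# Save BOTH state_dicts so training can resume later.
torch.save(
    {"model": model.state_dict(), "optimizer": sd},
    "checkpoint.pt",
)

# Resume: rebuild the same model/optimizer, then load both.
checkpoint = torch.load("checkpoint.pt")
model.load_state_dict(checkpoint["model"])
optimizer.load_state_dict(checkpoint["optimizer"])
```

If you skip reloading the optimizer's state_dict, Adam restarts with zeroed running averages and a reset step counter, which can noticeably perturb training right after the resume.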
Hope this helps - here are a few links that might assist.