Keep the momentum in the weights file

Hi, I have a small question

For example, with SGD momentum the velocity is accumulated by this formula (following the PyTorch source code):

v_{t+1} = momentum * v_t + g_{t+1}

Does PyTorch keep the optimizer's momentum in the .pth file? Or do we have to reset the momentum every time we resume training?

The optimizer state is kept separately; you can save and reload it. See optimizer.load_state_dict().
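A minimal sketch of this: checkpoint both the model and the optimizer state dicts, then restore both on resume. The model size, hyperparameters, and the file name `checkpoint.pth` are placeholders, not anything from the thread.

```python
import torch

# Toy model and SGD with momentum (hypothetical sizes/hyperparameters).
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One training step so the optimizer actually allocates momentum buffers.
loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()

# Save BOTH state dicts; the optimizer's contains a 'momentum_buffer'
# per parameter, which is what survives the restart.
torch.save(
    {"model": model.state_dict(), "optimizer": optimizer.state_dict()},
    "checkpoint.pth",
)

# Later: rebuild the same model/optimizer, then restore both states.
model2 = torch.nn.Linear(4, 2)
optimizer2 = torch.optim.SGD(model2.parameters(), lr=0.1, momentum=0.9)
ckpt = torch.load("checkpoint.pth")
model2.load_state_dict(ckpt["model"])
optimizer2.load_state_dict(ckpt["optimizer"])
```

If you saved only `model.state_dict()`, the momentum would start from zero on resume.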

@googlebot Thank you for the fast reply,

I have another question. If we use a pre-trained model on a new dataset, for example ResNet50 trained on ImageNet, then the momentum (for both SGD and Adam) and the step count (for Adam) would already be large numbers from the first epoch on the new dataset. Is that correct?

Hm, I don’t think you would reload the optimizer state in that case, since the old gradient history no longer applies to the new dataset.

Understood, thanks for your help.