Keep the momentum in weights file

User4 · December 29, 2020, 2:10pm

Hi, I have a small question

For example, with SGD momentum, the momentum buffer is accumulated with this formula (following the PyTorch source code):

$$v_{t+1} = \text{momentum} \cdot v_t + g_{t+1}$$

Does PyTorch keep the optimizer's momentum in the .pth file? Or do we have to reset the momentum every time we resume training?

googlebot · December 29, 2020, 2:30pm

Optimizer state is kept separately from the model weights. You can save and reload it, see optimizer.state_dict() and optimizer.load_state_dict().
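A minimal sketch of saving and restoring both together, assuming a toy `torch.nn.Linear` model and an in-memory buffer standing in for a checkpoint file (the layer sizes, the dictionary keys "model" and "optimizer", and the hyperparameters are illustrative choices, not anything PyTorch requires):

```python
import io

import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Take one step so the optimizer allocates its momentum buffers.
loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()

# Save the weights and the optimizer state in one checkpoint.
# The BytesIO buffer stands in for a path such as "checkpoint.pth".
buffer = io.BytesIO()
torch.save(
    {"model": model.state_dict(), "optimizer": optimizer.state_dict()},
    buffer,
)

# Later: rebuild the model and optimizer, then restore both states.
buffer.seek(0)
checkpoint = torch.load(buffer)
new_model = torch.nn.Linear(4, 2)
new_optimizer = torch.optim.SGD(new_model.parameters(), lr=0.1, momentum=0.9)
new_model.load_state_dict(checkpoint["model"])
new_optimizer.load_state_dict(checkpoint["optimizer"])

# The momentum buffer of the first parameter survives the round trip.
original = optimizer.state_dict()["state"][0]["momentum_buffer"]
restored = new_optimizer.state_dict()["state"][0]["momentum_buffer"]
```

If only `model.state_dict()` is saved, the momentum is not in the file, and a newly created optimizer starts with empty buffers.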

User4 · December 29, 2020, 2:45pm

@googlebot Thank you for the fast reply,

I have another question. Suppose we use a pre-trained model on a new dataset, for example ResNet50 trained on ImageNet. Would the momentum (for both SGD and Adam) and num_iterations (for Adam) already be large from the first epoch on the new dataset? Is that correct?

googlebot · December 29, 2020, 2:58pm

Hm, I don’t think you would reload optimizer state in that case, as old gradient history no longer applies.
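A rough sketch of that fine-tuning setup, assuming a toy `torch.nn.Linear` as a hypothetical stand-in for the pre-trained network: only the weights are loaded, and a newly constructed optimizer starts with no state, so the momentum and step count begin from zero.

```python
import torch

# Hypothetical stand-in for a pre-trained network such as ResNet50.
pretrained = torch.nn.Linear(4, 2)
weights = pretrained.state_dict()

# Fine-tuning: load only the weights, then create a new optimizer.
model = torch.nn.Linear(4, 2)
model.load_state_dict(weights)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# No per-parameter state exists before the first step.
state_before = dict(optimizer.state_dict()["state"])

# After one update on the new data, Adam's step counter is at 1.
loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()
step_after = float(optimizer.state_dict()["state"][0]["step"])
```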

User4 · December 30, 2020, 1:50am

Understood, thanks for your help.