For a typical fine-tuning use case, storing the model.state_dict() alone might be sufficient; this is also what happens when you fine-tune e.g. the torchvision models (you can't load an optimizer.state_dict() for them, as it isn't provided).
However, if you would like to "continue" the training later, then you should store the optimizer.state_dict() as well (and additionally the learning rate scheduler's state_dict(), if one is used).
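A minimal sketch of saving and restoring such a resumable checkpoint (the model, optimizer, and scheduler here are hypothetical placeholders; the checkpoint filename is an assumption):

```python
import torch
import torch.nn as nn

# Hypothetical small model and training objects for illustration.
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

# Store everything needed to continue training, not just the model.
checkpoint = {
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
    "scheduler": scheduler.state_dict(),
}
torch.save(checkpoint, "checkpoint.pt")

# Later: recreate the objects the same way, then load the stored states.
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

checkpoint = torch.load("checkpoint.pt")
model.load_state_dict(checkpoint["model"])
optimizer.load_state_dict(checkpoint["optimizer"])
scheduler.load_state_dict(checkpoint["scheduler"])
```

If you only want to fine-tune from the weights, saving and loading the `"model"` entry alone would suffice.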