Saving and loading a model in Pytorch?


Suppose I have a model class and a trainer class. I create an instance of the model and train it:

model = mymodel()
train = trainer.train(model...) 

How can I save the model to a file, after it has been trained and how can I then reload it and continue training? I searched for this but didn’t get an answer.

(Rinku Jadhav) #2

Load/save model parameters
(Mamy Ratsimbazafy) #3

@Rinku_Jadhav2014 unfortunately that tutorial is incomplete for resuming training. It only covers saving the model weights; it does not save the optimizer state, epoch, score, etc.

@Bixqu You can check the ImageNet Example, line 139. It saves a dictionary with everything needed to resume:

        'epoch': epoch + 1,
        'arch': args.arch,
        'state_dict': model.state_dict(),
        'best_prec1': best_prec1,
        'optimizer': optimizer.state_dict(),
    }, is_best)

and the save_checkpoint helper itself:

    def save_checkpoint(state, is_best, filename='checkpoint.pth.tar'):, filename)
        if is_best:
            shutil.copyfile(filename, 'model_best.pth.tar')

Loading/resuming from that dictionary is done here:

    if args.resume:
        if os.path.isfile(args.resume):
            print("=> loading checkpoint '{}'".format(args.resume))
            checkpoint = torch.load(args.resume)
            args.start_epoch = checkpoint['epoch']
            best_prec1 = checkpoint['best_prec1']
            model.load_state_dict(checkpoint['state_dict'])
            optimizer.load_state_dict(checkpoint['optimizer'])
            print("=> loaded checkpoint '{}' (epoch {})"
                  .format(args.resume, checkpoint['epoch']))
        else:
            print("=> no checkpoint found at '{}'".format(args.resume))

(Diego Silva) #4

Hi! The best and safest way to save your model parameters is something like this:

  model = MyModel()
  # ... after training, save your model, 'mymodel.pth')

  # ... to load your previously trained model:
  model.load_state_dict(torch.load('mymodel.pth'))

(Mamy Ratsimbazafy) #5

@diegslva Unfortunately this has the same issue as the tutorial: it won't save the epoch or the optimizer state, so you can't resume training, which is what the OP needs.

(Diego Silva) #6

@mratsim, you’re right! I misunderstood the question.
I don’t usually do that, but maybe something quick and dirty like this to save your entire object:

import copy
import pickle

# model stuff    
model = mymodel()
train = trainer.train(model...)

# copy your entire object and save it
saved_trainer = copy.deepcopy(train)
with open(r"my_trainer_object.pkl", "wb") as output_file:
    pickle.dump(saved_trainer, output_file)
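To pick the object back up later, the reverse works, assuming the trainer's class definition is importable at load time. A self-contained sketch (using a plain dict as a stand-in for the trainer, since any picklable object behaves the same way):

```python
import pickle

# stand-in for the trainer object
saved_trainer = {"epoch": 3, "loss": 0.25}

# save it, as above
with open("my_trainer_object.pkl", "wb") as output_file:
    pickle.dump(saved_trainer, output_file)

# later: load the object back and keep going
with open("my_trainer_object.pkl", "rb") as input_file:
    train = pickle.load(input_file)

print(train["epoch"])  # 3
```

Note that pickle stores references to classes by module path, so this breaks if the trainer class is moved or renamed between saving and loading.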


@mratsim & @diegslva, when I want to save the trained (i.e., fine-tuned) models of ResNet and DenseNet, the, './model.pth') method doesn’t work correctly; but when I use, './model.pth') the models are saved correctly. That is, when I load my saved models via the first approach, they don’t give me correct results, whereas with the second approach the results are good. Am I right? Would you please explain why this happens?


when you load the model back again via the state_dict method, remember to call model.eval(), otherwise the results will differ.

(Kelly Zhang) #9

Why will the results differ without calling MyModel.eval()?


Because your BatchNorm or Dropout layers are in train mode by default, on construction.

(Prasanna1991) #11

If my model doesn’t use layers like dropout or batchnorm, then it doesn’t make a difference whether I call model.eval() or not, right?

(Naofumi Tomita) #12

You’re right. It matters only when you use those layers, as described in the documentation. BN/Dropout behave differently at evaluation time, so you need to manually toggle the setting. You could alternatively use model.train(False). Also, make sure to call eval() at validation time.


I used .eval() and the results are still incorrect.

(Shuokai Pan) #14

HI guys,

I have a question about the behaviour of the dropout layer during training and evaluation. I remember reading in a paper that, because dropout leaves out some units during training, at evaluation time the outgoing weights of the dropout layer need to be scaled down by an amount corresponding to the dropout rate. For instance, if the dropout rate is 0.5, the outgoing weights need to be halved, because during evaluation we effectively have twice the number of active units.

So my question is, is this kind of weight scaling mechanism included in the dropout layer in pytorch as well?

Cheers and thanks a lot for your help.

(Anuvabh) #15

model.eval() takes care of this. However, I think it is scaling the activations and not the weights.

(Shuokai Pan) #16

Ok I understand. Thanks for the help.


(Herleeyandi Markoni) #17

@smth Is model.train(False) the same as model.eval()?

(ProKil) #18

It is true that model.eval() takes care of this. However, the scaling is applied at training time, not evaluation time. Quoting the docs:

Furthermore, the outputs are scaled by a factor of 1/(1-p) during training. This means that during evaluation the module simply computes an identity function.

(gj) #19

Newbie question…
Any conventions for filename extensions when saving a model and model weights with the following commands?

, PATH), PATH)


we’ve been using .pth, but it’s pretty arbitrary