What if I train a model for 50 epochs but notice that it starts to overfit at the 40th epoch? How can I save and load the model weights from the 40th epoch?
path = os.path.join(SAVE_DIR, 'model.pth')
torch.save(MODEL.cpu().state_dict(), path) # move the model to the CPU and save only its weights
MODEL.cuda() # move the model back to the GPU for further training
If you save it to a different buffer or file, it will not overwrite the previous one.
So if you follow the recommended approach @alwynmathew mentioned, you can, for example, use the number of the current epoch in the filename.
In the example, model is the model to save, epoch is the counter counting the epochs, and model_dir is the directory where you want to save your models.
You can call this, for example, every five or ten epochs.
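A minimal sketch of that idea, using the model, epoch and model_dir names described above. The helper names (checkpoint_path, save_checkpoint, load_checkpoint), the 'epoch-{}.pt' filename pattern, the temporary directory and the nn.Linear stand-in model are assumptions made for illustration, not something from the posts above:

```python
import os
import tempfile

import torch
import torch.nn as nn


def checkpoint_path(model_dir, epoch):
    """Build a per-epoch filename so earlier checkpoints are not overwritten.

    The 'epoch-{}.pt' pattern is a hypothetical naming choice.
    """
    return os.path.join(model_dir, 'epoch-{}.pt'.format(epoch))


def save_checkpoint(model, epoch, model_dir):
    """Save only the weights (state_dict) for the given epoch."""
    os.makedirs(model_dir, exist_ok=True)
    path = checkpoint_path(model_dir, epoch)
    torch.save(model.state_dict(), path)
    return path


def load_checkpoint(model, epoch, model_dir):
    """Load the weights saved at `epoch` into an already constructed model."""
    state = torch.load(checkpoint_path(model_dir, epoch), map_location='cpu')
    model.load_state_dict(state)
    return model


model_dir = tempfile.mkdtemp()  # placeholder directory for this sketch
model = nn.Linear(4, 2)         # stand-in for the real model

for epoch in range(1, 51):
    # ... training for one epoch would go here ...
    if epoch % 10 == 0:  # save every ten epochs
        save_checkpoint(model, epoch, model_dir)

# Later: go back to the weights from epoch 40.
restored = load_checkpoint(nn.Linear(4, 2), 40, model_dir)
```

Because each epoch gets its own file, the epoch-40 weights are still on disk after training has continued to epoch 50. Saving every epoch instead of every tenth would let you go back to any epoch, at the cost of more disk space.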
I also found examples in the documentation that use .pt instead of .pth, for example here, but also some that use .pth.
It is just a name, but is one file suffix explicitly recommended anywhere?