Hi, I am using PyTorch Lightning and trying to restore a model. I have the model_epoch=15.ckpt file and would like to resume from it, so I passed resume_from_checkpoint to the Trainer, but I get the following error:
Trying to restore training state but checkpoint contains only the model. This is probably due to ModelCheckpoint.save_weights_only being set to True.
This happens because I set save_weights_only=True in the ModelCheckpoint callback.
What should I do to restore the model?
Here is my code:
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint
from torch.utils.data import DataLoader

def main():
    exp_folder = get_experiment_folder()
    train_dataset = Hotelimages(split='train')
    valid_dataset = Hotelimages(split='valid')
    batch_size = 32
    train_dl = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    valid_dl = DataLoader(valid_dataset, batch_size=batch_size, shuffle=True)
    model = SwinTransformerFineTuning()
    # Checkpoints hold the model weights only, written every 5 epochs
    model_checkpoint = ModelCheckpoint(exp_folder, save_last=True, filename='model_{epoch}',
                                       save_weights_only=True, every_n_epochs=5)
    trainer = pl.Trainer(precision=16, default_root_dir=exp_folder, callbacks=[model_checkpoint],
                         max_epochs=100, accelerator='gpu', devices=1,
                         resume_from_checkpoint=run_folder/'1/model_epoch=14.ckpt')
    trainer.fit(model, train_dl, valid_dl)