Loaded model doesn't train anymore

Hi,
I’m working on code that fine-tunes a Mask R-CNN model on a dataset, saves it, and then loads it to use as the generator in a GAN framework, so the loaded Mask R-CNN model has to be trained further.
However, when I load the model and launch training, it doesn’t update, as if the parameters were frozen. How is that possible?
Here is the code, to make the problem clearer:

# save
PATH = "./saved_models/generator"+str(epoch)
torch.save({
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            }, PATH)

# load 
epoch = 0 # last epoch for which the model was saved

model = get_instance_segmentation_model(num_classes)
# params is defined elsewhere (not shown here)
optimizer = torch.optim.SGD(params, lr=0.005,
                            momentum=0.9, weight_decay=0.0005)

checkpoint = torch.load(PATH, map_location=device)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']


model.train()

Am I missing something? If I train the model without saving it and proceed straight to the GAN training phase, the model updates and I can see the results change, whereas if I do the same with a loaded model, the results don’t change.

Can you try to define the optimizer after loading the state dict?
Not sure if the optimizer is pointing to the proper tensors.

Hi @JuanFMontesinos and thanks for your answer.
Do you mean loading the state dict first and then defining the optimizer? I just tried that, but nothing changed.
In the training phase the loaded model has zero gradient flow; basically, it doesn’t train at all.

My thinking was that loading the state dict after defining the optimizer was making the optimizer update “other” weights (the ones that belonged to the model before they were replaced by the loaded ones).

If that’s not the case, you probably have a bug somewhere else.
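
To narrow down where such a bug might be, a rough check along these lines could help. It is only a sketch: `report_trainability` is a hypothetical helper name, and a tiny `nn.Linear` stands in for the Mask R-CNN model. It lists parameters that are frozen, parameters the optimizer does not hold, and parameters that received no gradient after a backward pass.

```python
import torch
import torch.nn as nn


def report_trainability(model, optimizer):
    """Hypothetical helper: list frozen parameters and parameters the optimizer does not hold."""
    # Identity of every tensor the optimizer will actually update.
    optimized_ids = {id(p) for group in optimizer.param_groups for p in group["params"]}
    frozen = [name for name, p in model.named_parameters() if not p.requires_grad]
    untracked = [
        name
        for name, p in model.named_parameters()
        if p.requires_grad and id(p) not in optimized_ids
    ]
    return frozen, untracked


# Tiny stand-in for the real model (assumption: any nn.Module works the same way here).
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

frozen, untracked = report_trainability(model, optimizer)

# Run one backward pass and collect parameters that received no gradient at all.
loss = model(torch.randn(3, 4)).sum()
loss.backward()
no_grad = [name for name, p in model.named_parameters() if p.grad is None]

print("frozen parameters:", frozen)
print("parameters missing from the optimizer:", untracked)
print("parameters without a gradient:", no_grad)
```

If all three lists come back empty on the real model, the parameters and the optimizer are probably wired up correctly, and the problem is more likely in how the loss reaches the generator (for example a detached tensor or a `torch.no_grad()` block somewhere in the GAN loop).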

Yeah, maybe you are right and the error is somewhere else in the code. If I’m not wrong, loading the state dict before defining the optimizer would not be possible, because calling .load_state_dict() on the optimizer requires it to be defined already.

Well, I meant loading the model’s state dict before passing its parameters to the optimizer.
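
As a minimal sketch of that order, assuming a small `nn.Linear` in place of the real model (`build_model` and the temporary checkpoint path are made up for the example): save a checkpoint, rebuild the model, load its state dict, then create the optimizer from the loaded model's parameters and restore the optimizer state. The last lines take one training step and check that the weights actually move.

```python
import os
import tempfile

import torch
import torch.nn as nn


def build_model():
    # Hypothetical stand-in for get_instance_segmentation_model(num_classes).
    return nn.Linear(4, 2)


def build_optimizer(model):
    return torch.optim.SGD(model.parameters(), lr=0.005,
                           momentum=0.9, weight_decay=0.0005)


# --- save ---
model = build_model()
optimizer = build_optimizer(model)
model(torch.randn(8, 4)).sum().backward()
optimizer.step()  # one step so the optimizer has momentum buffers to save

path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
torch.save({
    "epoch": 3,
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}, path)

# --- load: model weights first, then the optimizer built from that model ---
resumed = build_model()
checkpoint = torch.load(path, map_location="cpu")
resumed.load_state_dict(checkpoint["model_state_dict"])
resumed_optimizer = build_optimizer(resumed)
resumed_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
start_epoch = checkpoint["epoch"]
resumed.train()

# --- sanity check: one training step should change the weights ---
before = resumed.weight.detach().clone()
resumed_optimizer.zero_grad()
resumed(torch.randn(8, 4)).sum().backward()
resumed_optimizer.step()
weights_changed = not torch.equal(before, resumed.weight.detach())
print("resumed from epoch", start_epoch, "- weights changed:", weights_changed)
```

Running the same before/after comparison on the real generator is a cheap way to confirm whether the loaded model is being updated at all.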

This worked for me, thanks! My loading code goes like this:

# Load the network
model = NeuralNetwork().to(device)
# model.load_state_dict(torch.load(save_name))
checkpoint = torch.load(save_name)
model.load_state_dict(checkpoint['state_dict'])
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
optimizer.load_state_dict(checkpoint['optimizer'])
print("Model loaded!")