Pre-training problem in PyTorch

Nicolo_Savioli · June 14, 2018, 1:46pm

I need to initialize my model from pre-trained weights. My setup is as follows:

The pre-trained model is saved as a “.pt” file.
The new model (the one that should receive the pre-trained weights) is defined in code.

I tried these two solutions, but neither works:

Solution 1) from Adam Paszke (How to load part of pre trained model?)

import torch

model = model_in_code_from_autograd().cuda()
pretrain_model = torch.load("path/../model.pt").cuda()
model_dict = model.state_dict()
pretrained_dict = pretrain_model.state_dict()
# 1. filter out keys that do not exist in the new model
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
# 2. overwrite entries in the existing state dict
model_dict.update(pretrained_dict)
# 3. load the new state dict
model.load_state_dict(model_dict)
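
One thing worth checking with this approach: the filter in step 1 silently drops every key that is not present in the new model, so if the parameter names differ (for example, a "module." prefix added by nn.DataParallel), nothing is copied and no error is raised. Below is a minimal sketch of that check. The small models built by make_model are hypothetical stand-ins, not the models from this thread, and the shape comparison is an extra safeguard that the original snippet does not have.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in architecture (not the model from the thread).
def make_model():
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

pretrain_model = make_model()
model = make_model()

model_dict = model.state_dict()
pretrained_dict = pretrain_model.state_dict()

# Keep only entries whose name and shape both match the new model.
matched = {k: v for k, v in pretrained_dict.items()
           if k in model_dict and v.shape == model_dict[k].shape}
skipped = [k for k in pretrained_dict if k not in matched]

print("matched:", len(matched), "of", len(model_dict), "entries in the new model")
print("skipped:", skipped)

model_dict.update(matched)
model.load_state_dict(model_dict)

# Simulate a checkpoint saved from nn.DataParallel: every key carries a "module." prefix.
prefixed = {"module." + k: v for k, v in pretrained_dict.items()}
filtered = {k: v for k, v in prefixed.items() if k in model_dict}
print("matched with prefixed keys:", len(filtered))  # no key matches, nothing would be loaded
```

If the matched count is zero or much smaller than expected, the new model keeps its random initialization even though load_state_dict succeeds.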

Solution 2) I used copy_ to write the pre-trained parameters into the “state_dict” of the new model

params_p_model = pretrain_model.named_parameters()
for name_p, param_p in params_p_model:
    # state_dict() tensors share storage with the model, so copy_ writes into the model in place
    model.state_dict()[name_p].data.copy_(param_p)
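
A possible pitfall with this variant: named_parameters() yields only learnable parameters, not buffers such as the running mean and variance of batch-norm layers. Whether the model in this thread contains such layers is not stated, so the following is only a sketch under that assumption, using a hypothetical stand-in model.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in with a BatchNorm layer (assumption: the real model has buffers too).
def make_model():
    return nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8))

torch.manual_seed(0)
pretrain_model = make_model()
model = make_model()

# Run one batch in training mode so the running statistics move away from their defaults.
pretrain_model.train()
pretrain_model(torch.randn(16, 4))

# Copy parameters only, as in Solution 2.
with torch.no_grad():
    for name, param in pretrain_model.named_parameters():
        model.state_dict()[name].copy_(param)

params_equal = all(torch.equal(p, q)
                   for p, q in zip(model.parameters(), pretrain_model.parameters()))
buffers_equal_before = torch.equal(model.state_dict()["1.running_mean"],
                                   pretrain_model.state_dict()["1.running_mean"])
print("parameters copied:", params_equal)
print("running_mean copied:", buffers_equal_before)

# load_state_dict transfers parameters and buffers together.
model.load_state_dict(pretrain_model.state_dict())
buffers_equal_after = torch.equal(model.state_dict()["1.running_mean"],
                                  pretrain_model.state_dict()["1.running_mean"])
print("running_mean copied after load_state_dict:", buffers_equal_after)
```

In this sketch the parameters match after the loop but the running statistics do not, which can make outputs in eval mode differ from those of the pre-trained model.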

Neither snippet raises an error, but the network behaves as if its weights were random.

Thank you.

citypocket · June 14, 2018, 7:54pm

You can print some of the weights before saving and after loading to check that they are the same.

Did you call model.eval() after loading?
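
A rough sketch of both checks, assuming the two models share the same architecture. The make_model function and the mismatched_keys helper are hypothetical names introduced here for illustration, not part of PyTorch or of the original post.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in architecture (not the model from the thread).
def make_model():
    return nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8), nn.ReLU(), nn.Linear(8, 2))

def mismatched_keys(a, b):
    """Return the state_dict keys whose tensors are missing in b or differ between a and b."""
    sd_a, sd_b = a.state_dict(), b.state_dict()
    return [k for k in sd_a if k not in sd_b or not torch.equal(sd_a[k], sd_b[k])]

torch.manual_seed(0)
pretrain_model = make_model()
model = make_model()

before = mismatched_keys(pretrain_model, model)
model.load_state_dict(pretrain_model.state_dict())
after = mismatched_keys(pretrain_model, model)
print("differing entries before loading:", before)
print("differing entries after loading:", after)

# Switch both models to evaluation mode so dropout and batch-norm layers
# behave deterministically, then compare outputs on the same input.
model.eval()
pretrain_model.eval()
x = torch.randn(3, 4)
with torch.no_grad():
    same_output = torch.allclose(model(x), pretrain_model(x))
print("same output in eval mode:", same_output)
```

If the list of differing entries is empty after loading but the outputs still look random, the problem is more likely in how the model is used afterwards (for example, train vs. eval mode) than in the weight transfer itself.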

Nicolo_Savioli · June 14, 2018, 8:03pm

Hey, thanks. Have a look at my re-post: After transfer learning model restart from zero