I need to load pre-trained weights into my model. My setup is as follows:
- The pre-trained model is saved in a “.pt” file.
- The new model (the one that should receive the pre-trained weights) is built from code.
I tried these two solutions, but neither works:
Solution 1) From Adam Paszke's answer in “How to load part of pre trained model?”:
import torch

model = model_in_code_from_autograd().cuda()
pretrain_model = torch.load("path/../model.pt").cuda()
model_dict = model.state_dict()
pretrained_dict = pretrain_model.state_dict()
# 1. filter out keys that do not exist in the new model
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
# 2. overwrite entries in the existing state dict
model_dict.update(pretrained_dict)
# 3. load the new state dict
model.load_state_dict(model_dict)
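One way to check whether the three steps above actually transfer anything is to count the keys that survive the filter. The following is only a minimal sketch: the two small networks are hypothetical stand-ins for my real models, `filter_matching` is a helper name made up for illustration, and the shape comparison is an extra check that is not part of the original recipe.

```python
import torch.nn as nn


def filter_matching(pretrained_dict, model_dict):
    """Keep entries whose key exists in model_dict with the same shape.

    The shape comparison is an added assumption, not part of the
    original three-step recipe.
    """
    return {
        k: v
        for k, v in pretrained_dict.items()
        if k in model_dict and v.shape == model_dict[k].shape
    }


# Hypothetical stand-ins: same layer, but different parameter names.
pretrain_model = nn.Linear(4, 2)        # keys: "weight", "bias"
model = nn.Sequential(nn.Linear(4, 2))  # keys: "0.weight", "0.bias"

model_dict = model.state_dict()
pretrained_dict = pretrain_model.state_dict()

matched = filter_matching(pretrained_dict, model_dict)
skipped = sorted(set(pretrained_dict) - set(matched))
print(f"matched {len(matched)} of {len(model_dict)} keys; skipped: {skipped}")

# With no matching keys, the update is a no-op and loading still succeeds,
# so the model silently keeps its initial weights.
model_dict.update(matched)
model.load_state_dict(model_dict)
```

In this toy case no key matches, so nothing is copied and no error is raised. If the same count is zero for the real models, the parameter names in the “.pt” file differ from the names in the new model.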
Solution 2) I used copy_ to write the pre-trained parameters directly into the “state_dict” of the new model:
params_p_model = pretrain_model.named_parameters()
for name_p, param_p in params_p_model:
    model.state_dict()[name_p].data.copy_(param_p)
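To confirm that such a copy really changes the new model, the parameters can be compared afterwards. This is a rough sketch under stated assumptions: two identical `nn.Linear` layers stand in for the real networks, and the `mismatched` list is a name introduced here for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins with identical parameter names and shapes.
pretrain_model = nn.Linear(4, 2)
model = nn.Linear(4, 2)

# Tensors returned by state_dict() share storage with the model's
# parameters, so an in-place copy_ updates the model itself.
target = model.state_dict()
with torch.no_grad():
    for name_p, param_p in pretrain_model.named_parameters():
        if name_p in target:  # skip names the new model does not have
            target[name_p].copy_(param_p)

# List every parameter of the new model that still differs.
pretrained_params = dict(pretrain_model.named_parameters())
mismatched = [
    name
    for name, param in model.named_parameters()
    if name not in pretrained_params
    or not torch.equal(param, pretrained_params[name])
]
print(mismatched)
```

An empty list means every parameter of the new model now equals its pre-trained counterpart. Note that named_parameters() does not include buffers such as BatchNorm running statistics, so those would not be copied by this loop.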
Neither snippet raises an error, but the network behaves as if its weights were random.
Thank you.