I have trained a model and saved its weights (the state_dict) after training. Now I want to create two copies of the same model with the weights loaded. I am using the following code for this purpose:
import torch

PATH_TO_WEIGHTS = 'path/model.pth'
model = MyModel().cuda()
pretrained_weight = torch.load(PATH_TO_WEIGHTS)
model.load_state_dict(pretrained_weight)
modelCopy = model
I would like to know whether modelCopy will also have the weights loaded from my saved checkpoint, or whether I have to do something like modelCopy = deepcopy(model), or load the weights again for modelCopy.
modelCopy is just another reference to the same object as model, so it does have the loaded weights, but any parameter change will be reflected under both names.
If you want to use the same state_dict in two independent models, you could use deepcopy or initialize a second model and load the state_dict again.
The following code demonstrates the referencing:
import torch
import torch.nn as nn

# Setup
model = nn.Linear(1, 1)
sd = model.state_dict()
# Load state_dict
modelA = nn.Linear(1, 1)
modelA.load_state_dict(sd)
modelB = modelA
# Check params
for pA, pB in zip(modelA.parameters(), modelB.parameters()):
    print((pA == pB).all())
> tensor(True)
> tensor(True)
# modelB is referencing modelA
with torch.no_grad():
    modelA.weight.fill_(100.)
print(modelA.weight)
> Parameter containing:
tensor([[100.]], requires_grad=True)
print(modelB.weight)
> Parameter containing:
tensor([[100.]], requires_grad=True)
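For contrast, here is a minimal sketch of the two options mentioned above for getting independent models: loading the same state_dict into a second instance, and using deepcopy. It reuses nn.Linear(1, 1) as a stand-in for the real model, and the in-memory state_dict stands in for the saved checkpoint; both are assumptions made for illustration only.

```python
import copy

import torch
import torch.nn as nn

# Stand-in for the trained model; its state_dict plays the role of the checkpoint
model = nn.Linear(1, 1)
sd = model.state_dict()

# Option 1: create a separate instance and load the same state_dict into it
modelA = nn.Linear(1, 1)
modelA.load_state_dict(sd)

# Option 2: deep-copy a model that already has the weights loaded
modelB = copy.deepcopy(modelA)

# Both models start with identical values but own separate parameter tensors
same_values = all(
    torch.equal(pA, pB)
    for pA, pB in zip(modelA.parameters(), modelB.parameters())
)
shares_storage = modelA.weight.data_ptr() == modelB.weight.data_ptr()
print(same_values, shares_storage)
# > True False

# Changing modelA no longer affects modelB
with torch.no_grad():
    modelA.weight.fill_(100.)
print(torch.equal(modelA.weight, modelB.weight))
# > False
```

Either option should work for the original use case. Loading the state_dict a second time avoids copying any extra attributes attached to the first model, while deepcopy is convenient when the model is already set up and only a duplicate is needed.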