Create a copy of a model along with loaded weights

jitesh · March 24, 2020, 3:46pm

I have trained a model and saved its weights (the state_dict) after training. Now I want to create two copies of the same model with the weights loaded. I am using the following code for this purpose:

    import torch

    # MyModel is my own nn.Module subclass (definition not shown)
    PATH_TO_WEIGHTS = 'path/model.pth'
    model = MyModel().cuda()
    pretrained_weight = torch.load(PATH_TO_WEIGHTS)
    model.load_state_dict(pretrained_weight)
    modelCopy = model

I would like to know whether modelCopy also has the weights loaded from my saved checkpoint, or whether I have to do something like modelCopy = deepcopy(model), or load the weights again for modelCopy.

Any help would be appreciated. Thanks!

ptrblck · March 25, 2020, 5:52am

modelCopy references the same object as model, so it is not a copy and any parameter change will be visible through both names.
If you want to use the same state_dict in two independent models, you could use deepcopy or initialize a second model and load the state_dict again.

This code demonstrates the referencing:

    import torch
    import torch.nn as nn

    # Setup
    model = nn.Linear(1, 1)
    sd = model.state_dict()

    # Load state_dict
    modelA = nn.Linear(1, 1)
    modelA.load_state_dict(sd)

    # modelB is a second name for the same object, not a copy
    modelB = modelA

    # Check params
    for pA, pB in zip(modelA.parameters(), modelB.parameters()):
        print((pA == pB).all())
    # > tensor(True)
    # > tensor(True)

    # modelB is referencing modelA, so an in-place change shows up in both
    with torch.no_grad():
        modelA.weight.fill_(100.)

    print(modelA.weight)
    # > Parameter containing:
    # > tensor([[100.]], requires_grad=True)
    print(modelB.weight)
    # > Parameter containing:
    # > tensor([[100.]], requires_grad=True)
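
For contrast, here is a minimal sketch of the two independent-copy options mentioned above: copy.deepcopy, or a second instance plus load_state_dict. It assumes nn.Linear(1, 1) as a stand-in for the real model, and the names model_copy and model_reloaded are only illustrative.

```python
import copy

import torch
import torch.nn as nn

# Stand-in for the trained model (assumption: any nn.Module behaves the same way).
model = nn.Linear(1, 1)

# Option 1: deepcopy clones the module together with its parameters.
model_copy = copy.deepcopy(model)

# Option 2: build a second instance and load the same state_dict into it.
# load_state_dict copies the values into the new module's own parameters.
model_reloaded = nn.Linear(1, 1)
model_reloaded.load_state_dict(model.state_dict())

# All three models start out with identical parameter values.
same_before = all(
    torch.equal(p, q) and torch.equal(p, r)
    for p, q, r in zip(
        model.parameters(), model_copy.parameters(), model_reloaded.parameters()
    )
)
print(same_before)  # True

# An in-place change to the original no longer reaches either copy.
with torch.no_grad():
    model.weight.fill_(100.)

print(model.weight.item())                    # 100.0
print(model_copy.weight.item() == 100.0)      # False
print(model_reloaded.weight.item() == 100.0)  # False
```

If the model lives on the GPU, deepcopy keeps the copy on the same device, while a freshly constructed second instance has to be moved there (for example with .cuda()) before or after loading the state_dict.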

jitesh · March 25, 2020, 5:54am

Thanks a lot. The explanation, especially with the code snippets, made it very clear.