Modifying a model's layer weights via state_dict

capoling · November 19, 2019, 4:31am

Hello!

I am trying to zero out some filter weights of a PyTorch model before and after training. Once I have located the correct layers and filters, I replace that precise key in the state_dict (an OrderedDict) with a value of torch.zeros(correct size). By changing the value in the state_dict, am I actually changing the whole model, so that it is ready for training with my intended change? In other words, does the change also propagate to model.parameters() or anything else that is used in train.py? If not, what's the best way of doing so?

Thanks a lot!

ptrblck · November 19, 2019, 5:46am

If you don’t create a deepcopy of the state_dict, it should work:

import torchvision.models as models

model = models.resnet18()
sd = model.state_dict()
sd['fc.weight'].zero_()  # in-place edit on the tensor held by the state_dict
print(model.fc.weight)
> Parameter containing:
tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]], requires_grad=True)

However, the better approach would probably be to zero out the parameters directly in the model by wrapping the manipulation in a with torch.no_grad() block and changing the parameters as you wish.
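
A minimal sketch of why the in-place call matters for the state_dict route, assuming a small hypothetical nn.Sequential model in place of resnet18 (the layer sizes and variable names are illustrative, not from the thread). Assigning a new tensor to a key only rebinds the dictionary entry; the model changes either when the tensor is edited in place or when the edited dict is loaded back with load_state_dict:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the model in the thread.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),
    nn.ReLU(),
    nn.Conv2d(8, 4, kernel_size=3),
)

sd = model.state_dict()

# 1) Rebinding the key replaces the dict entry only; the model is untouched.
sd["0.weight"] = torch.zeros_like(sd["0.weight"])
rebinding_changed_model = bool((model[0].weight == 0).all())

# 2) Loading the edited dict copies the new values into the parameters.
model.load_state_dict(sd)
after_load_is_zero = bool((model[0].weight == 0).all())

# 3) An in-place edit works directly, because the tensor returned by
#    state_dict() shares its storage with the parameter.
sd2 = model.state_dict()
sd2["2.weight"].zero_()
inplace_changed_model = bool((model[2].weight == 0).all())

print(rebinding_changed_model, after_load_is_zero, inplace_changed_model)
```

So replacing a key with torch.zeros(...), as described in the question, would need a following model.load_state_dict(sd) to take effect, whereas zero_() on the existing entry does not.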

capoling · November 20, 2019, 7:58am

Thank you for the quick response! If I were to use with torch.no_grad(), how would I change the model parameters directly as you mentioned? For instance, if I wanted to zero out the weights of the filter at index 34 of a weight tensor, how would I use torch.no_grad() and model.parameters() to do so? Thanks in advance!

ptrblck · November 20, 2019, 7:48pm

This code snippet should work:

import torch
import torchvision.models as models

model = models.resnet18()
with torch.no_grad():
    model.conv1.weight[34].zero_()  # zero the filter at index 34 in place
print(model.conv1.weight[33:35])
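
The same idea can be wrapped in a small helper for several filters at once. This is only a sketch: zero_filters is a hypothetical name, and the standalone nn.Conv2d below is an assumption that mirrors the shape of resnet18's conv1 so the example runs without torchvision.

```python
import torch
import torch.nn as nn


def zero_filters(conv: nn.Conv2d, indices):
    """Zero the weights of the given output filters in place (hypothetical helper)."""
    with torch.no_grad():
        # Indexed assignment under no_grad modifies the parameter itself
        # without recording the operation in autograd.
        conv.weight[indices] = 0.0


# Assumed to match resnet18's first layer: 64 filters of shape 3x7x7, no bias.
conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
zero_filters(conv, [34])

filter_34_is_zero = bool((conv.weight[34] == 0).all())
filter_33_is_zero = bool((conv.weight[33] == 0).all())
print(filter_34_is_zero, filter_33_is_zero)
```

The parameter stays a trainable leaf (requires_grad is still True), so an optimizer step can move the zeroed filter away from zero again; this is presumably why the original question mentions zeroing both before and after training.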

Maxime_G · November 1, 2020, 12:38pm

Modifying by reference doesn't work for me; this method worked: