Requires_grad is lost in state_dict?

Roman_Girin · July 4, 2021, 3:57pm

Is it correct behavior that the tensors holding the model's weights in state_dict have requires_grad set to False, even though the same tensors in the 'original' model have requires_grad set to True?
Does the state_dict contain references to the tensors in the model, or just copies of the model's tensors?

Here is code to reproduce the case (I use PyTorch 1.9.0 with CUDA 11.1 on Windows 10):

import torch.nn as nn
import torch.nn.functional as F

# define some model
class TheModelClass(nn.Module):
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return x

model = TheModelClass()
print(model.conv1.weight.requires_grad)
st = model.state_dict()
print(st['conv1.weight'].requires_grad)

This code outputs:
True
False

ptrblck · July 4, 2021, 11:22pm

Yes, this is expected: the model defines whether its parameters are frozen, while the state_dict stores only their values.
You can thus create a new model instance, freeze or unfreeze its parameters as you want, and load the state_dict afterwards.
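
A minimal sketch of the workflow described above, which also touches on the reference-vs-copy question. It assumes a bare nn.Conv2d layer in place of TheModelClass, and the variable names (sd, sd_vars, new_model, shares_memory) are illustrative only. As far as I know, the default state_dict() returns detached tensors that still share storage with the model's parameters, and passing keep_vars=True returns the parameters themselves:

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 6, 5)

# Default state_dict(): entries are detached from autograd, so requires_grad
# is False, but they point at the same memory as the model's parameters.
sd = model.state_dict()
shares_memory = sd["weight"].data_ptr() == model.weight.data_ptr()
print(sd["weight"].requires_grad, shares_memory)

# keep_vars=True skips the detach and returns the parameters themselves.
sd_vars = model.state_dict(keep_vars=True)
print(sd_vars["weight"].requires_grad)

# Freezing is a property of the model: set it on a new instance,
# then load the stored values into it.
new_model = nn.Conv2d(3, 6, 5)
for p in new_model.parameters():
    p.requires_grad_(False)
new_model.load_state_dict(sd)

# The frozen flag is kept and the values now match the original model.
print(new_model.weight.requires_grad)
print(torch.equal(new_model.weight, model.weight))
```

Because the default entries share storage with the model, an in-place change to the weights is visible through a previously obtained state_dict; clone the tensors first if you need an independent snapshot.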