Is it correct behavior that the tensors holding the model's weights in the state_dict have requires_grad set to False, even though the same tensors in the 'original' model have requires_grad set to True?
And does the state_dict contain references to the tensors in the model, or just copies of the model's tensors?
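One way to probe this directly (a sketch; `Tensor.data_ptr()` and the `keep_vars` argument of `state_dict()` are both part of the public API) is to compare storage pointers and see what `keep_vars=True` returns:

```python
import torch.nn as nn

m = nn.Conv2d(3, 6, 5)
w = m.weight

# default keep_vars=False: entries are detached tensors
sd = m.state_dict()
print(sd['weight'].requires_grad)               # False
print(sd['weight'].data_ptr() == w.data_ptr())  # True -> same underlying storage

# keep_vars=True: entries are the Parameter objects themselves
sd_vars = m.state_dict(keep_vars=True)
print(sd_vars['weight'].requires_grad)          # True
```

If the `data_ptr()` values match, the state_dict entry shares storage with the parameter rather than being an independent copy.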
Here is code to reproduce the case (I use PyTorch 1.9.0 with CUDA 11.1 on Windows 10):
```python
import torch.nn as nn
import torch.nn.functional as F

# define some model
class TheModelClass(nn.Module):
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return x

model = TheModelClass()
print(model.conv1.weight.requires_grad)

st = model.state_dict()
print(st['conv1.weight'].requires_grad)
```
This code outputs: