nn.Parameter(requires_grad=False) displayed in summary as trainable parameter?

Hello,

I am building and training a model with an nn.Parameter set at init from a tensor that matches a PCA low-rank output:
self.PCA_matrix_V_Inv = torch.nn.Parameter(PCAMatrix.clone().transpose(0,1), requires_grad=False)

I do pass requires_grad=False, but torchinfo's summary counts this parameter as trainable. I also tried setting requires_grad=False on the initial PCAMatrix tensor, but as expected that did not change the behavior (it should already be False by default on a Tensor).

Since I set requires_grad at init time of my nn.Module, I don't get why it is listed as a trainable parameter.

Does anyone have a clue about this?
Thank you

The requires_grad attribute can easily be changed by calling .requires_grad_(True) on the parent module, which is why I guess it's returned as a trainable parameter:

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # frozen at init, but requires_grad can still be flipped later
        self.param = nn.Parameter(torch.randn(1), requires_grad=False)
        
    def forward(self, x):
        x = x * self.param
        return x


model = MyModel()
print(dict(model.named_parameters()))
# {'param': Parameter containing:
# tensor([1.6849])}

print(model.param.requires_grad)
# False

model.requires_grad_(True)  # flips requires_grad on all registered parameters
print(model.param.requires_grad)
# True

If you never want to train it, register the tensor as a buffer.
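
A minimal sketch of that approach (the BufferedModel name is just for illustration): a buffer is saved in the state_dict and moves with the module, but it is never returned by parameters(), so requires_grad_(True) on the module cannot touch it:

import torch
import torch.nn as nn

class BufferedModel(nn.Module):
    def __init__(self):
        super().__init__()
        # saved in the state_dict and moved with .to()/.cuda(), but not a parameter
        self.register_buffer("param", torch.randn(1))

    def forward(self, x):
        return x * self.param


model = BufferedModel()
print(dict(model.named_parameters()))
# {} -- no parameters at all

model.requires_grad_(True)  # affects parameters only; buffers are untouched
print(model.param.requires_grad)
# False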

However, it seems the behavior of torchinfo.summary might also have changed, as I cannot reproduce the issue:

from torchinfo import summary

model = MyModel()
print(dict(model.named_parameters()))
# {'param': Parameter containing:
# tensor([1.6849])}

print(model.param.requires_grad)
# False

summary(model)
# =================================================================
# Layer (type:depth-idx)                   Param #
# =================================================================
# MyModel                                  (1)
# =================================================================
# Total params: 1
# Trainable params: 0
# Non-trainable params: 1
# =================================================================

model.requires_grad_(True)
print(model.param.requires_grad)
# True

summary(model)
# =================================================================
# Layer (type:depth-idx)                   Param #
# =================================================================
# MyModel                                  1
# =================================================================
# Total params: 1
# Trainable params: 1
# Non-trainable params: 0
# =================================================================

Thank you for this answer. I will register the PCAMatrix tensor as a buffer, as I don't expect it ever to be trained. Have a nice day!