Do I need to use nn.Parameter() again when changing a tensor wrapped in nn.Parameter()?

I have the following code (removed some parts for clarity):

class Cell(nn.Module):
    def __init__(self, core_size, conn_size, conn_num):
        super().__init__()
        [...]
        core = torch.Tensor(core_size)
        self.core = nn.Parameter(core)
        nn.init.ones_(self.core)

    def forward(self, x):
        [...]
        self.core = torch.mul(self.core, x)
        [...]

When I try to run it, I get an error related to the operation in forward:

cannot assign 'torch.FloatTensor' as parameter 'core' (torch.nn.Parameter or None expected)

I'm trying to change the self.core parameter based on the input and keep it stored within the model. Do I need to use torch.nn.Parameter again when adjusting the self.core parameter (e.g. self.core = torch.nn.Parameter(torch.mul(self.core, x)))? Would doing so reset the grad of self.core?

The torch.mul operation is differentiable and thus creates a non-leaf tensor. If you explicitly want to set this output as the new value of the trainable parameter self.core, rather than computing its gradient and updating it through an optimizer as in the common use case, you can do so inside a no_grad context. Something like this should work:

def forward(self, x):
    [...]
    out = torch.mul(self.core, x)
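    # write the new value into the parameter in place; the copy itself is not tracked by autograd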
    with torch.no_grad():
        self.core.copy_(out)
    [...]

However, make sure this is really what you want: you are creating a trainable parameter, yet you are not using its gradients and would be updating it manually.
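
For reference, here is a minimal self-contained sketch of the pattern above. The core size of 4 and the toy input are made-up values just for illustration:

import torch
import torch.nn as nn

class Cell(nn.Module):
    def __init__(self, core_size):
        super().__init__()
        # trainable parameter, initialized to ones as in the original snippet
        self.core = nn.Parameter(torch.ones(core_size))

    def forward(self, x):
        # differentiable op: produces a non-leaf tensor with a grad_fn
        out = torch.mul(self.core, x)
        # store the new value in the parameter without recording the copy in autograd
        with torch.no_grad():
            self.core.copy_(out)
        return out

cell = Cell(core_size=4)
x = torch.randn(4)
out = cell(x)

print(type(cell.core))                                # still torch.nn.parameter.Parameter
print(torch.equal(cell.core.detach(), out.detach()))  # True: the new value was copied in
print(out.grad_fn)                                    # MulBackward0: out still carries graph history

After the forward pass, self.core remains a registered nn.Parameter (so it stays in state_dict and model.parameters()), while its value has been overwritten by the result of the multiplication.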


Thank you for your response! I do believe that is what I want. Just to check, am I correctly assuming that I will still be able to backprop through this to any operations before torch.mul(), but with the old value of self.core being used in gradient calculations?