My learnable vector cannot be updated

I want to add a learnable vector to ResNet. In the __init__ function of ResNet, I inserted this code:

        weight = torch.rand((5, 1), device='cuda:0')
        self.weight = weight.requires_grad_(True)

In the forward function, I do a matrix multiplication between out and self.weight, where out is the result of the original ResNet (self._forward_impl):

    def forward(self, x):
        out = self._forward_impl(x)                 #(batch_size * 5, 1)

        out = out.view(-1, 5)                       #(batch_size, 5)
        out = torch.mm(out, self.weight)            #(batch_size, 5)x(5, 1)=(batch_size, 1)

        return out

When I compute a loss with Focal Loss and run loss.backward(), the self.weight value never changes, and sometimes self.weight.grad is even None.

    print(self.weight)
    print(self.weight.grad)

Output across two training iterations (the gradient is populated, but the weight stays the same):

    tensor([[0.2026],
            [0.6027],
            [0.2351],
            [0.0260],
            [0.2805]], device='cuda:0', requires_grad=True)
    tensor([[-0.1866],
            [-0.2713],
            [-0.2793],
            [-0.1892],
            [-0.1899]], device='cuda:0')
    tensor([[0.2026],
            [0.6027],
            [0.2351],
            [0.0260],
            [0.2805]], device='cuda:0', requires_grad=True)
    tensor([[-0.2546],
            [-0.3381],
            [-0.3341],
            [-0.2421],
            [-0.2273]], device='cuda:0')

I don’t know why self.weight is not being updated…

Wrap your weight in nn.Parameter. That registers the tensor as one of the model’s parameters. Otherwise it is not visible to model.parameters(), which is what the optimizer tries to update.
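A minimal sketch of the fix, using a hypothetical toy module instead of the full ResNet (and CPU tensors for brevity, where the original uses cuda:0):

    import torch
    import torch.nn as nn

    class ToyModel(nn.Module):
        def __init__(self):
            super().__init__()
            # nn.Parameter registers the tensor with the module, so it
            # appears in model.parameters() and receives optimizer updates.
            # requires_grad is True by default for nn.Parameter.
            self.weight = nn.Parameter(torch.rand(5, 1))

        def forward(self, x):
            out = x.view(-1, 5)               # (batch_size, 5)
            return torch.mm(out, self.weight) # (batch_size, 5) x (5, 1) = (batch_size, 1)

    model = ToyModel()
    # The weight is now visible to the optimizer:
    print(any(p is model.weight for p in model.parameters()))  # True

A bare tensor with requires_grad=True still gets a .grad after backward(), but since it never reaches the optimizer’s parameter list, optimizer.step() never touches it, which matches the symptom above.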


Thank you! I changed my code as you said, and now it works well!