Updating model parameters with a new loss function

Hi everyone, I'm trying to update the parameters of a pretrained masked language model with a new custom loss function, but the gradients of the parameters are None. I used a pretrained BERT masked language model and a pretrained classifier. I concatenated a vector to the batch to create a new batch, and when I try to optimize the language model's parameters, I see that their gradients are None, so the weights cannot be updated. Without the concatenation everything works fine. Can you please advise what I should do about this issue?

loss = loss_function(scores, true_labels)
loss.backward()

# every gradient prints as None here
for param in maskedlm_.parameters():
    print(param.grad)

update_weights(maskedlm_)  # custom helper
optimizer.step()

My guess would be that you are creating a non-leaf tensor with the concatenation.
Could you check the is_leaf attribute of the tensors involved before and after the concatenation?
If it shows False afterwards (and True before), re-wrap the new tensor in an nn.Parameter.

Thanks for your suggestion. I checked the is_leaf attribute and all of them were True, but the gradients of the parameters are still None. Do you have any other suggestions? Maybe there is a problem with my pretrained classifier or loss function?
By the way, I used this code snippet for checking the leaves:

for parameter in maskedlm_.parameters():
    print(parameter.data.is_leaf)

Your loop will explicitly return only the parameters, which should all be leaves (and calling .data additionally returns a detached tensor, which is always a leaf, so this check can never report False). You should instead manually check the is_leaf attribute of the tensors before and after the concatenation:

import torch
import torch.nn as nn

a = nn.Parameter(torch.randn(1))
b = nn.Parameter(torch.randn(1))

print(a.is_leaf)
> True
print(b.is_leaf)
> True

c = torch.cat((a, b), dim=0)
print(c.is_leaf)
> False
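
Re-wrapping the result makes it a leaf again. A minimal sketch continuing the snippet above (note that detach() cuts the tensor out of the autograd graph, so a and b would no longer receive gradients through it):

# detach() returns a leaf tensor; wrapping it in nn.Parameter
# creates a fresh trainable leaf (independent of a and b)
c = nn.Parameter(torch.cat((a, b), dim=0).detach())
print(c.is_leaf)
> True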

Thank you, your suggestion was very helpful and I finally found the problem :)