Using autograd when requires_grad=False

chrome · June 11, 2019, 11:12am

I am trying to implement GradNorm from https://github.com/hosseinshn/GradNorm/blob/master/GradNormv8.ipynb. My model uses a pretrained word2vec embedding layer for which I have set requires_grad=False.
Cell 4 of the notebook contains this line:
G1R = torch.autograd.grad(l1, param[0], retain_graph=True, create_graph=True)
It throws the following error:
    G1R = torch.autograd.grad(l1, param[0], retain_graph=True, create_graph=True)
  File "/usr/lib64/python3.6/site-packages/torch/autograd/__init__.py", line 145, in grad
    inputs, allow_unused)
RuntimeError: One of the differentiated Tensors does not require grad.
Is there any way to make this code work for my use case, where the pretrained embedding has requires_grad=False?

albanD · June 11, 2019, 1:26pm

What is param[0] supposed to be here? The weights of your embedding layer?

chrome · June 12, 2019, 6:40am

# Getting gradients of the first layers of each tower and calculate their l2-norm
param = list(MTL.parameters())

where MTL is an instance of the class MTLnet(nn.Module).
I am not sure myself why they are using param[0].

albanD · June 12, 2019, 11:46am

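I would first check what param[0] actually is and why the notebook uses it.
Then make sure that tensor requires gradients, since torch.autograd.grad raises this error for any input that has requires_grad=False.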
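One way to check what param[0] is would be to list the parameters by name. This is only a sketch: MTLnetSketch and its layer names are hypothetical stand-ins, not the notebook's MTLnet. It relies on named_parameters() yielding parameters in the same order as parameters(), which is the order the layers were registered in.

```python
import torch.nn as nn

# Hypothetical stand-in for MTLnet: an embedding registered first, then one tower.
class MTLnetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(10, 4)
        # Frozen, as in the question.
        self.embedding.weight.requires_grad = False
        self.tower1 = nn.Linear(4, 1)

MTL = MTLnetSketch()

# (name, shape, requires_grad) for each parameter, in parameters() order.
summary = [(name, tuple(p.shape), p.requires_grad)
           for name, p in MTL.named_parameters()]
for index, row in enumerate(summary):
    print(index, *row)
```

If the embedding is the first registered submodule, as in this sketch, index 0 is the frozen embedding weight, and passing it to torch.autograd.grad would fail in the way described above.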
Note that if gradients with respect to the embedding are needed to compute your gradient penalty loss but you don’t want to update the embedding, you can leave requires_grad=True on it and simply not pass its parameters to the optimizer, so that it is never updated.
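A minimal sketch of that setup, under some assumptions: TinyNet, the random "pretrained" matrix, and the MSE loss are made up for illustration and are not taken from the notebook. The embedding keeps requires_grad=True, and only the head's parameters are handed to the optimizer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the pretrained word2vec matrix (10 words, 4 dims).
pretrained = torch.randn(10, 4)

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        # freeze=False leaves requires_grad=True on the embedding weight.
        self.embedding = nn.Embedding.from_pretrained(pretrained.clone(), freeze=False)
        self.head = nn.Linear(4, 1)

    def forward(self, tokens):
        # Average the word vectors of each sequence, then apply the head.
        return self.head(self.embedding(tokens).mean(dim=1))

model = TinyNet()

# Only the head is given to the optimizer, so the embedding is never stepped.
optimizer = torch.optim.SGD(model.head.parameters(), lr=0.1)

tokens = torch.tensor([[1, 2, 3], [4, 5, 6]])
target = torch.zeros(2, 1)
loss = nn.functional.mse_loss(model(tokens), target)

# This call no longer raises, because the embedding weight requires grad.
(grad_emb,) = torch.autograd.grad(
    loss, model.embedding.weight, retain_graph=True, create_graph=True
)
grad_norm = grad_emb.norm(2)

# An ordinary training step: the head changes, the embedding does not.
before = model.embedding.weight.detach().clone()
optimizer.zero_grad()
loss.backward()
optimizer.step()
embedding_unchanged = torch.equal(before, model.embedding.weight.detach())
print(embedding_unchanged)
```

The trade-off is that autograd still computes and stores a gradient for the embedding on every backward pass, which costs some memory and time compared with a truly frozen layer.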