Getting gradient of two losses

binbbaz · February 24, 2022, 11:47am

I am trying to get the gradients of two losses in the following code snippet but all I get is None (AttributeError: ‘NoneType’ object has no attribute ‘data’)

img = img.to(device)
#img.requires_grad = True
input = model(img)
input_prediction = input.max(1, keepdim=True)[1]

btarget = torch.tensor(2).unsqueeze(0).to(device)
x_prime.requires_grad = True
x_prime_output = model(x_prime)
x_prime_pred =  x_prime_output.max(1, keepdim=True)[1]
l_target_loss = F.nll_loss(x_prime_output, btarget)
# model.zero_grad()
l_target_loss.backward(retain_graph = True)
target_grad = l_target_loss.grad.data

l_argmax_loss = F.nll_loss(x_prime_output, input_prediction.squeeze(0))
l_argmax_loss.backward()
l_argmax_grad = l_argmax_loss.grad.data

What am I getting wrong, please?

binbbaz · February 24, 2022, 11:52am

@albanD, could you help, sir?

albanD · February 24, 2022, 4:56pm

Hey!
In general, you should never use .data.

Also, the gradients computed by the backward pass are stored on leaf Tensors that require gradients: the model parameters and, more generally, Tensors you create with requires_grad=True. So here you will get gradients for x_prime, for example, not for the loss.
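As a rough illustration of that point, here is a minimal sketch. The tiny model below is a hypothetical stand-in (the model from the question is not shown), and the shapes are made up; only the pattern of where .grad ends up matters.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in for the model in the question: a tiny classifier
# that outputs log-probabilities, which is what F.nll_loss expects.
model = torch.nn.Sequential(torch.nn.Linear(4, 3), torch.nn.LogSoftmax(dim=1))

x_prime = torch.randn(1, 4, requires_grad=True)  # leaf tensor that requires grad
btarget = torch.tensor([2])

loss = F.nll_loss(model(x_prime), btarget)
loss.backward()

# The loss is a non-leaf tensor, so backward() does not populate its .grad
# (PyTorch also emits a UserWarning when .grad is read on a non-leaf tensor).
loss_grad = loss.grad

# The input is a leaf with requires_grad=True, so it does receive a gradient.
input_grad = x_prime.grad
```

With this setup, loss_grad is None, which is what produces the AttributeError once .data is accessed on it, while input_grad has the same shape as x_prime.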

binbbaz · February 24, 2022, 5:01pm

Thank you for your reply. I added l_target_loss.retain_grad() before calling l_target_loss.backward(retain_graph=True) and it worked. I can now get values for target_grad and l_argmax_grad. However, both values are the same. They are both 1’s. I feel something isn’t right.

binbbaz · February 24, 2022, 7:06pm

@albanD, I feel that I’m doing something wrong. What I actually want is the gradient of the target_loss with respect to the input (x) and the gradient of the l_argmax_loss with respect to the input (x). The input x is img in the code snippet above. How do I go about it, please?

albanD · March 3, 2022, 6:04pm

"They are both 1’s."

That is expected, as each of them contains the gradient of a loss with respect to itself, e.g. d l_argmax_loss / d l_argmax_loss, which is obviously 1.
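A small sketch of why this happens, using a made-up scalar loss rather than the one from the question: calling retain_grad() on the loss only keeps the seed gradient that backward() starts from.

```python
import torch

x = torch.randn(3, requires_grad=True)

# Illustrative loss only; any scalar function of x behaves the same way here.
loss = (x ** 2).sum()

loss.retain_grad()  # ask autograd to keep .grad on this non-leaf tensor
loss.backward()

loss_grad = loss.grad  # d loss / d loss
x_grad = x.grad        # d loss / d x, which is 2 * x for this loss
```

Here loss_grad is just the value 1 that backward() uses as its starting point, while the informative gradient is the one stored on x.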

"gradient of the l_argmax_loss with respect to the input (x)"

To get this quantity, you should:

- make sure x.requires_grad=True
- compute l_argmax_loss in a differentiable manner
- call l_argmax_loss.backward()
- read the gradient at x.grad
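The steps above could be sketched for both losses roughly as follows. This is only a sketch under assumptions: the model is a hypothetical stand-in, the shapes are made up, and the gradients are taken with respect to x_prime, since that is the tensor both losses are computed from in the snippet (the argmax over model(img) is not differentiable, so it only serves as a label).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in model that outputs log-probabilities for F.nll_loss.
model = torch.nn.Sequential(torch.nn.Linear(4, 3), torch.nn.LogSoftmax(dim=1))

img = torch.randn(1, 4)                          # made-up input shape
x_prime = torch.randn(1, 4, requires_grad=True)  # step 1: requires_grad=True

# The predicted class of img is used purely as a target label.
with torch.no_grad():
    input_prediction = model(img).argmax(dim=1)  # shape (1,)

btarget = torch.tensor([2])

# Step 2: compute both losses in a differentiable manner from x_prime.
x_prime_output = model(x_prime)
l_target_loss = F.nll_loss(x_prime_output, btarget)
l_argmax_loss = F.nll_loss(x_prime_output, input_prediction)

# Steps 3 and 4 for the first loss: backward, then read x_prime.grad.
l_target_loss.backward(retain_graph=True)
target_grad = x_prime.grad.clone()

# .grad accumulates across backward calls, so reset it before the second one.
x_prime.grad = None

# Steps 3 and 4 for the second loss.
l_argmax_loss.backward()
l_argmax_grad = x_prime.grad.clone()
```

If you would rather not touch .grad at all, torch.autograd.grad(loss, x_prime, retain_graph=True) returns the gradient directly instead of accumulating it into x_prime.grad.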