Derivative with respect to the input

smu226 · June 29, 2019, 3:22am

Hello! I want to calculate the derivatives (actually Jacobian) of a NN with respect to its input. Usually I do something like this:

torch.autograd.grad(y, x, create_graph=True)[0]

But in this case x doesn’t have requires_grad set (it is the input to the network, so it should stay fixed). How can I calculate the derivative in this case? Thank you!

InnovArul · June 29, 2019, 4:43am

You could set the requires_grad property of x to get the gradient w.r.t. x:

x.requires_grad = True
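A minimal sketch of how the Jacobian with respect to the input could be computed once x requires grad. The small network `net` and the input size are hypothetical stand-ins, not the original poster's model. The loop calls torch.autograd.grad once per output component; torch.autograd.functional.jacobian, available in recent PyTorch releases, should give the same matrix in one call.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the network: 3 inputs -> 2 outputs.
net = torch.nn.Sequential(
    torch.nn.Linear(3, 4),
    torch.nn.Tanh(),
    torch.nn.Linear(4, 2),
)

x = torch.randn(3)
x.requires_grad = True  # lets autograd track operations on the input
y = net(x)

# Each autograd.grad call on one output component yields one Jacobian row.
rows = [
    torch.autograd.grad(y[i], x, retain_graph=True)[0]
    for i in range(y.numel())
]
jac_rows = torch.stack(rows)  # shape (2, 3)

# The functional API builds the same matrix without an explicit loop.
jac_fn = torch.autograd.functional.jacobian(net, x.detach())  # shape (2, 3)

print(torch.allclose(jac_rows, jac_fn, atol=1e-6))
```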

smu226 · June 29, 2019, 4:55am

But if I do this, won’t x itself be changed when I backpropagate through the NN?

InnovArul · June 29, 2019, 4:58pm

torch.autograd.grad() just returns the gradient. It doesn’t modify the variables or store the gradient in .grad, as far as I know. You need to set x.requires_grad = True to tell autograd.grad() that x is a leaf variable to differentiate with respect to.
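A small check of that behaviour, as a sketch with a toy function rather than a real network: the gradient comes back as a return value, x.grad stays empty, and the values of x are untouched.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
x_before = x.detach().clone()

y = (x ** 2).sum()

# autograd.grad returns the gradient instead of accumulating it into x.grad.
(dy_dx,) = torch.autograd.grad(y, x)

print(dy_dx)                            # gradient of sum(x^2), i.e. 2 * x
print(x.grad)                           # None: nothing was stored on x
print(torch.equal(x.detach(), x_before))  # x itself was not modified
```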

smu226 · June 29, 2019, 5:11pm

Yes but I want to add this derivative I am calculating (using autograd.grad()) to the final loss function. Won’t x be modified when I call loss.backward(), because x has the requires_grad parameter associated to it?

InnovArul · June 29, 2019, 5:19pm

I am not sure I follow correctly. I think adding this gradient to the loss function will not have any effect. Don’t you think so?

smu226 · June 29, 2019, 5:34pm

So what I need is to calculate the derivative of the NN with respect to the input, something like dx = torch.autograd.grad(NN(x), x), and then add this derivative to the loss and backpropagate. For example, loss = (NN(x)-x)**2+abs(dx) and then loss.backward(). Won’t the value of x itself be changed if I have x.requires_grad? For example, if x represents the pixels of an image, won’t the values of the pixels themselves be affected, i.e. the image will change?

InnovArul · June 29, 2019, 6:09pm

Gradient is calculated when there is a computation graph. For example,

x --> linear(w, x) --> softmax(). Here, x, w could be potentially leaf nodes that require gradient.

In this same paradigm, when you add dx to the loss function, it is just like adding a constant to the loss function. The weights of the NN don’t depend on the gradient, and this dx doesn’t have any computation graph associated with it. So I am not sure x will be affected by this loss function at all.
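Whether dx carries a computation graph depends on the create_graph argument of torch.autograd.grad. The sketch below, using a hypothetical toy network, compares the two cases: with the default create_graph=False the result behaves like a constant, while with create_graph=True (as in the first post) it stays attached to the graph, so a penalty on dx can reach the weights.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy network, only for illustration.
net = torch.nn.Sequential(
    torch.nn.Linear(3, 4),
    torch.nn.Tanh(),
    torch.nn.Linear(4, 1),
)
x = torch.randn(3, requires_grad=True)

# Default: the returned gradient is detached, so it acts like a constant.
dx_const = torch.autograd.grad(net(x).sum(), x)[0]

# create_graph=True: the gradient itself is differentiable.
dx_graph = torch.autograd.grad(net(x).sum(), x, create_graph=True)[0]

print(dx_const.requires_grad)  # False
print(dx_graph.requires_grad)  # True

# A penalty on dx_graph backpropagates into the network weights.
dx_graph.abs().sum().backward()
print(net[0].weight.grad.abs().sum() > 0)
```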

Yao_Xuan · August 7, 2019, 9:56pm

Hi, I have exactly the same question. Did you figure out whether setting requires_grad=True on the input x influences the backward propagation? Thank you very much.

ptrblck · August 9, 2019, 11:30pm

If your input requires gradient, its .grad attribute will be populated after the backward pass, but besides that it won’t change the gradient calculation.
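A sketch of one training step with the loss described earlier in the thread, assuming a hypothetical 3-to-3 toy network and plain SGD. Only the network parameters are handed to the optimizer, so after the step x.grad is populated but the values of x are the same as before.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy network mapping 3 inputs to 3 outputs.
net = torch.nn.Sequential(
    torch.nn.Linear(3, 8),
    torch.nn.Tanh(),
    torch.nn.Linear(8, 3),
)
# x is not passed to the optimizer, so the optimizer never updates it.
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

x = torch.randn(3, requires_grad=True)
x_before = x.detach().clone()
w_before = net[0].weight.detach().clone()

out = net(x)
# create_graph=True keeps dx differentiable so it can be part of the loss.
dx = torch.autograd.grad(out.sum(), x, create_graph=True)[0]
loss = ((out - x) ** 2).mean() + dx.abs().mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()

print(x.grad is not None)                           # .grad was populated
print(torch.equal(x.detach(), x_before))            # x values are unchanged
print(torch.equal(net[0].weight.detach(), w_before))  # weights were updated
```

If the input should not accumulate a gradient across iterations, one option is to reset x.grad to None between steps.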

Yao_Xuan · August 13, 2019, 9:48pm

Thank you very much. It is very helpful.

chetan06 · April 25, 2020, 2:41pm

Can this be used with a pretrained model from torchvision.models?

Ilya_Kamenshchikov · May 3, 2020, 1:28pm

It seems that you can’t backpropagate through the derivative as of PyTorch 1.5:

import torch

x = torch.Tensor([1, 2, 3])
x.requires_grad_(True)
c = x ** 2
print(c.grad_fn)  # c has gradient function
dc_dx = torch.autograd.grad(c.mean(), x)[0]
print(dc_dx)
print(dc_dx.grad_fn)  # derivative of c over x doesn't have gradient function
dc_dx.backward()  # raises RuntimeError

penghao_wu · December 1, 2022, 11:28pm

You may try this

dc_dx = torch.autograd.grad(c.mean(), x, create_graph=True, retain_graph=True)[0]

But you have to call backward() on a scalar, such as torch.norm(dc_dx).
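Putting the suggestion together with the earlier snippet, a minimal sketch could look like this. The comparison at the end uses the hand-derived value for this particular toy function, where dc_dx = 2 * x / 3.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
c = x ** 2

# create_graph=True records the gradient computation, so dc_dx has a grad_fn.
dc_dx = torch.autograd.grad(c.mean(), x, create_graph=True)[0]
print(dc_dx.grad_fn is not None)  # True

# backward() needs a scalar, so reduce dc_dx first.
torch.norm(dc_dx).backward()

# For this function, d/dx ||2x/3|| = (2/3) * x / ||x||.
expected = (2.0 / 3.0) * x.detach() / x.detach().norm()
print(torch.allclose(x.grad, expected))
```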