I have read all the other threads on the subject but I do not get what I am doing wrong. I have just started using PyTorch, so I am probably doing something stupid. I have a VGG19 trained on CIFAR10 (without the final softmax); let us call it net. Then I have my input normalized_input, which is simply the first image of the test dataset with a batch dimension of one. Now I would like to compute the gradient of the output w.r.t. the input. Right now I am doing this:
net.zero_grad()
output = softmax(net(normalized_input), dim=1)
for i in range(10):
    output[0, i].backward(retain_graph=True)
print("GRADIENT")
print(normalized_input.grad)
But normalized_input.grad is None and it is not clear to me why.
You have to make sure normalized_input is wrapped in a Variable with requires_grad=True.
Try normalized_input = Variable(normalized_input, requires_grad=True) and check it again.
Usually this flag is set to False, since you don’t normally need the gradient w.r.t. the input.
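Applied to your snippet, it would look roughly like this (just a sketch, assuming softmax is torch.nn.functional.softmax as in your code):

from torch.autograd import Variable
from torch.nn.functional import softmax

# wrap the input so autograd tracks the gradient w.r.t. it
normalized_input = Variable(normalized_input, requires_grad=True)

net.zero_grad()
output = softmax(net(normalized_input), dim=1)
output[0, 0].backward(retain_graph=True)
print(normalized_input.grad)  # should no longer be None

On recent PyTorch versions you can skip Variable entirely and just call normalized_input.requires_grad_(True) on the tensor.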
I was kindly told that my understanding of how PyTorch gradients work was completely off the mark. With some help I solved my problem by using this code:
net.zero_grad()
input.requires_grad_(True)  # the input must require grad, otherwise autograd.grad cannot reach it
output = net(input)
g = torch.zeros(batch_size, 10, 3, 32, 32)
for i in range(10):
    g[:, i] = torch.autograd.grad(output[:, i].sum(), input, retain_graph=True)[0].data
print("GRADIENT:")
print(g[0])
It is probably not super helpful, but maybe it will help someone in the future. I am not sure that net.zero_grad() is still required now that I use torch.autograd.grad.
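As a sanity check, a toy example like this (just a sketch with a small linear layer, not the actual VGG19) seems to show that torch.autograd.grad returns the gradients directly and leaves the .grad buffers untouched:

import torch

lin = torch.nn.Linear(4, 2)
x = torch.randn(1, 4, requires_grad=True)
out = lin(x).sum()

dx, = torch.autograd.grad(out, x)   # gradient is returned, not accumulated into .grad
print(dx.shape)                     # torch.Size([1, 4])
print(lin.weight.grad)              # still None: no accumulation into parameter .grad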
Btw, is there a place where I can read how autograd works in detail, without having to read the source code?
Hi @diravan, thanks for this nice solution! I was trying it just now but encountered one small problem, so I wanted to drop it here and see if you might be able to help out.
When I tried to execute the torch.autograd.grad line from your code, I got RuntimeError: element 0 of variables tuple is volatile. I didn’t know how to deal with it, but I checked my output and input Variables and neither requires gradients, i.e., output.requires_grad and input.requires_grad are both False. But even after I set those two Variables to require grad, I still get the above error.
for batch_idx, (data, target) in enumerate(train_loader):
    data, target = data.to(device), target.to(device)
    data.requires_grad = True  ### CRUCIAL LINE !!!
    optimizer.zero_grad()
    output = model(data)
    loss = F.nll_loss(output, target)
    loss.backward()  # calculates x.grad = dloss/dx for every x with x.requires_grad=True
Afterwards, you can easily access dloss/ddata via data.grad.
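For example, right after loss.backward() you could read it out like this (just a sketch; input_grad is my own name for it):

    input_grad = data.grad            # same shape as data; this is dloss/ddata
    print(input_grad.abs().max())     # e.g. inspect the largest input gradient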