I have read all the other threads on the subject but I do not get what I am doing wrong. I have just started using PyTorch, so I am probably doing something stupid. I have a VGG19 trained on CIFAR10 (without the final softmax); let us call it net. Then I have my input normalized_input, which is simply the first image of the test dataset with a batch dimension of one. Now I would like to compute the gradient of the output w.r.t. the input. Right now I am doing this:
net.zero_grad()
output = softmax(net(normalized_input), dim=1)
for i in range(10):
    output[0, i].backward(retain_graph=True)
print("GRADIENT")
print(normalized_input.grad)
But normalized_input.grad is None and it is not clear to me why.
You have to make sure normalized_input is wrapped in a Variable with requires_grad=True.
Try normalized_input = Variable(normalized_input, requires_grad=True) and check it again.
Usually this flag is set to False, since you don’t normally need the gradient w.r.t. the input.
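Applied to your snippet, it would look roughly like this (just a sketch, assuming softmax is torch.nn.functional.softmax as in your code):

from torch.autograd import Variable
from torch.nn.functional import softmax

# wrap the input so autograd tracks the gradient w.r.t. it
normalized_input = Variable(normalized_input, requires_grad=True)

net.zero_grad()
output = softmax(net(normalized_input), dim=1)
output[0, 0].backward(retain_graph=True)
print(normalized_input.grad)  # should no longer be None

On recent PyTorch versions you can skip Variable entirely and just call normalized_input.requires_grad_(True) on the tensor.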
I was kindly told that my understanding of how PyTorch gradients work was completely off the mark. With some help I solved my problem by using this code:
net.zero_grad()
input.requires_grad_(True)  # the input must require grad, otherwise autograd.grad cannot reach it
output = net(input)
g = torch.zeros(batch_size, 10, 3, 32, 32)
for i in range(10):
    g[:, i] = torch.autograd.grad(output[:, i].sum(), input, retain_graph=True)[0].data
print("GRADIENT:")
print(g[0])
It is probably not super helpful, but maybe it will help someone in the future. I am not sure that net.zero_grad() is still required now that I use torch.autograd.grad.
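As a sanity check, a toy example like this (just a sketch with a small linear layer, not the actual VGG19) seems to show that torch.autograd.grad returns the gradients directly and leaves the .grad buffers untouched:

import torch

lin = torch.nn.Linear(4, 2)
x = torch.randn(1, 4, requires_grad=True)
out = lin(x).sum()

dx, = torch.autograd.grad(out, x)   # gradient is returned, not accumulated into .grad
print(dx.shape)                     # torch.Size([1, 4])
print(lin.weight.grad)              # still None: no accumulation into parameter .grad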
Btw, is there a place where I can read how autograd works in detail, without having to read the source code?
Hi @diravan, thanks for this nice solution! I was trying it just now but encountered one small problem, so I wanted to drop it here and see if you might be able to help out.
When I tried to execute the torch.autograd.grad line from your code, I got RuntimeError: element 0 of variables tuple is volatile. I didn’t know how to deal with it, but I checked my output and input Variables and neither requires gradients, i.e., output.requires_grad and input.requires_grad are both False. But even after I set those two Variables to require grad, I still get the above error.
for batch_idx, (data, target) in enumerate(train_loader):
    data, target = data.to(device), target.to(device)
    data.requires_grad = True  ### CRUCIAL LINE !!!
    optimizer.zero_grad()
    output = model(data)
    loss = F.nll_loss(output, target)
    loss.backward()  # calculates x.grad = dloss/dx for every x with x.requires_grad=True
Afterwards, you can easily access dloss/ddata via data.grad.
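For example, right after loss.backward() you could read it out like this (just a sketch; input_grad is my own name for it):

    input_grad = data.grad            # same shape as data; this is dloss/ddata
    print(input_grad.abs().max())     # e.g. inspect the largest input gradient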