How did you check the gradient shape? Could you post the code snippet you were using?
If you want to create gradients for the input, you would have to set the requires_grad attribute of the input tensor to True before the forward pass and could then check the gradient via print(input.grad) after the backward call.
The gradient would have the same shape as the input in this case.
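A minimal sketch of this, using a stand-in linear model rather than the thread's actual network:

```python
import torch
import torch.nn as nn

# Hypothetical tiny model just to demonstrate the shape of input.grad
model = nn.Linear(4, 2)

x = torch.randn(1, 4, requires_grad=True)  # set requires_grad before the forward pass
out = model(x)
out.sum().backward()

print(x.grad.shape)  # torch.Size([1, 4]) -- same shape as the input
```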
def generate_gradients(self, input_image, target_class):
    # Move to the GPU first, then set requires_grad, so the flag is set on a leaf tensor
    input_image = input_image.cuda()
    input_image.requires_grad = True
    model_output = self.model(input_image)
    # Zero gradients
    self.model.zero_grad()
    # Target for backprop
    one_hot_output = torch.zeros(1, model_output.size(-1))
    one_hot_output[0, target_class] = 1
    one_hot_output = one_hot_output.cuda()
    # Backward pass
    model_output.backward(gradient=one_hot_output, retain_graph=True)
    # Convert the gradient stored by the hook to a numpy array;
    # expected shape is (1, 3, 224, 224), but it currently comes out as (1, 64, 224, 224)
    gradients_as_arr = self.gradients.detach().cpu().numpy()
    return gradients_as_arr
Yes, I am using hooks to create it. When I try to set requires_grad = True, I get RuntimeError: you can only change requires_grad flags of leaf variables. The code is a bit large to post here, so I cannot provide a completely reproducible example. My intention is to do guided backpropagation; I believe the gradient needs to flow all the way back to the inputs so I can get a gradient in the shape of the input (correct me if I am wrong). Do I need to set requires_grad = True at training time then?
The error message would be raised if input_image is not a leaf variable, i.e. if it was created by another operation. If that’s the case, it might already require gradients, and you should be able to access them via input_image.grad.
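A small CPU-only sketch reproducing the error on a non-leaf tensor:

```python
import torch

x = torch.randn(2, 3, requires_grad=True)  # leaf tensor
y = x * 2                                  # non-leaf: produced by an operation

raised = False
try:
    y.requires_grad = True
except RuntimeError as e:
    raised = True
    print(e)  # you can only change requires_grad flags of leaf variables

# The non-leaf tensor already requires grad through its input x:
print(y.is_leaf, y.requires_grad)  # False True
```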
Since the current self.gradients shape is different from input_image.shape, it seems you are registering the hook on a different parameter or layer.
I did some more debugging; input_image.grad now returns a [1, 3, 224, 224] tensor as expected and no longer throws an error about the input not having grads. However, self.gradients.data is still [1, 64, 224, 224].
This is how I am hooking into the layers. As you can see, the hook is on the first layer itself, so the gradient flowing into the first layer should have shape [1, 3, 224, 224] and not [1, 64, 224, 224], right?
Got it, but will register_backward_hook affect the grad_in of the module? I am placing the hook on the first layer, where it should be, so the gradient flowing in should have the shape of the image, [1, 3, 224, 224], but both grad_in and grad_out are [1, 64, 224, 224]. Any tips for this?
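One possible explanation, sketched below with a hypothetical stand-in for the real network's first layer: the legacy register_backward_hook is known to report ill-defined grad_input for some modules, while register_full_backward_hook (available since PyTorch 1.8) returns the gradient with respect to the module's actual input, which here has the image shape:

```python
import torch
import torch.nn as nn

# Hypothetical first layer: Conv2d mapping 3 input channels to 64
model = nn.Sequential(nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU())

grads = {}

def hook(module, grad_input, grad_output):
    # grad_input: gradient w.r.t. the module's input
    # grad_output: gradient w.r.t. the module's output
    grads["in"] = grad_input[0]
    grads["out"] = grad_output[0]

model[0].register_full_backward_hook(hook)

x = torch.randn(1, 3, 224, 224, requires_grad=True)
model(x).mean().backward()

print(grads["in"].shape)   # torch.Size([1, 3, 224, 224])
print(grads["out"].shape)  # torch.Size([1, 64, 224, 224])
```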