Need clarification regarding "register_backward_hook" function

I could not find good documentation on the register_backward_hook() function.

In particular, I cannot figure out what the hook receives in the grad_in argument.


    def hook_layers(self):
        def hook_function(module, grad_in, grad_out):
            self.gradients = grad_in[0]
            print("len(grad_in) ", len(grad_in))
            print("grad_in[0].shape ", grad_in[0].shape)
            print("grad_in[1].shape ", grad_in[1].shape)
            print("grad_in[2].shape ", grad_in[2].shape)

        # Register the hook on the first layer of the feature extractor
        first_layer = list(self.model.features._modules.items())[0][1]
        first_layer.register_backward_hook(hook_function)

Running this on an ImageNet model, I get the following print output for the first layer:

    len(grad_in)  3
    grad_in[0].shape  torch.Size([1, 3, 224, 224])
    grad_in[1].shape  torch.Size([64, 3, 11, 11])
    grad_in[2].shape  torch.Size([64])

I do not quite get what those three tuple entries correspond to.

When I run the same code on a VGG16 network, I get:

    len(grad_in)  3
    grad_in[0]  None
    grad_in[1].shape  torch.Size([64, 3, 3, 3])
    grad_in[2].shape  torch.Size([64])

Here the first tuple item is None.
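
For reference, the second and third shapes line up with the first layer's weight and bias. A quick way to check (a minimal sketch, assuming torchvision's pretrained VGG16 rather than the exact model I am using):

    from torchvision import models

    model = models.vgg16(pretrained=True)
    first_layer = list(model.features._modules.items())[0][1]

    # These match grad_in[1] and grad_in[2] printed by the hook
    print(first_layer.weight.shape)  # torch.Size([64, 3, 3, 3])
    print(first_layer.bias.shape)    # torch.Size([64])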

For context, this is how the backward pass that triggers the hook is run:

    def generate_gradients(self, input_image, target_class):
        model_output = self.model(input_image)
        self.model.zero_grad()
        # Backpropagate a one-hot vector for the target class
        one_hot_output = torch.FloatTensor(1, model_output.size()[-1]).zero_()
        one_hot_output[0][target_class] = 1
        model_output.backward(gradient=one_hot_output)
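
For completeness, here is a standalone sketch that wires the hook and the backward call together (assuming a pretrained torchvision VGG16, a random 224x224 input, and an arbitrary target class; only the printed shapes matter here):

    import torch
    from torchvision import models

    model = models.vgg16(pretrained=True)

    def hook_function(module, grad_in, grad_out):
        print("len(grad_in)", len(grad_in))
        for i, g in enumerate(grad_in):
            print("grad_in[%d]" % i, None if g is None else g.shape)

    # Register the hook on the first convolutional layer
    first_layer = list(model.features._modules.items())[0][1]
    first_layer.register_backward_hook(hook_function)

    # Backpropagate a one-hot vector for the chosen class
    input_image = torch.randn(1, 3, 224, 224, requires_grad=True)
    target_class = 5
    model_output = model(input_image)
    model.zero_grad()
    one_hot_output = torch.zeros(1, model_output.size(-1))
    one_hot_output[0][target_class] = 1
    model_output.backward(gradient=one_hot_output)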

Hi,

You can check the docs for it here; in particular, the warning there explains that this is a known limitation (or bug) of the current implementation.
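
In case it helps: newer PyTorch releases (1.8 and later) also provide register_full_backward_hook, which is meant to report gradients with respect to the module's inputs and outputs rather than those of an internal operation. A minimal sketch with a single Conv2d (not your full model):

    import torch
    import torch.nn as nn

    conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)

    def full_hook(module, grad_input, grad_output):
        # grad_input: gradients w.r.t. the module's inputs
        # grad_output: gradients w.r.t. the module's outputs
        print([None if g is None else g.shape for g in grad_input])
        print([g.shape for g in grad_output])

    conv.register_full_backward_hook(full_hook)

    x = torch.randn(1, 3, 224, 224, requires_grad=True)
    conv(x).sum().backward()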

Oh yeah, I missed the warning and was very puzzled by the behavior.