Bug in register_backward_hook

Hi, during training I use DataParallel to train a model, and I'd like to get the gradient of the input of the first layer of the model. I use register_backward_hook. But when I modify some unrelated code, sometimes the gradient is captured and sometimes it isn't. I am extremely confused.

Can you post the code where register_backward_hook is used?

Could you provide a minimal working example to demonstrate your issue? It is really difficult to help without enough context.

    import torch
    import torch.nn as nn

    # lists that collect what the hooks capture
    layer_inputs, layer_outputs = [], []
    grad_inputs, grad_outputs = [], []

    def forward_hook(module, input, output):
        # save the first input tensor and the output of each Conv2d
        layer_inputs.append(input[0])
        layer_outputs.append(output)

    def backward_hook(module, grad_input, grad_output):
        # save detached copies of the gradients w.r.t. input and output
        grad_inputs.append(grad_input[0].detach().clone())
        grad_outputs.append(grad_output[0].detach().clone())

    # register both hooks on every convolution layer
    for p in model.modules():
        if isinstance(p, nn.Conv2d):
            p.register_forward_hook(forward_hook)
            p.register_backward_hook(backward_hook)

I just want to get the gradient data of every convolution layer in ResNet101.
There should be 208 items in the grad list, but sometimes there are only 207, 206, or 205.
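In case it helps to reproduce this, here is a minimal, self-contained sketch of the setup described above. The torchvision ResNet-101, the random input batch, and the CUDA guard are my assumptions standing in for the original training code, which isn't shown. It registers the backward hook on every Conv2d, runs one forward/backward pass, and prints how many hooks were registered versus how many times the hook actually fired:

    import torch
    import torch.nn as nn
    import torchvision.models as models

    # Assumption: a stock torchvision ResNet-101; the real training code is not shown.
    model = models.resnet101()
    if torch.cuda.is_available():
        model = nn.DataParallel(model).cuda()

    grad_outputs = []

    def backward_hook(module, grad_input, grad_output):
        # Collect only grad_output[0]; grad_input entries can be None for the
        # very first conv, whose input does not require grad.
        grad_outputs.append(grad_output[0].detach().clone())

    num_convs = 0
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            m.register_backward_hook(backward_hook)
            num_convs += 1

    # Assumption: a dummy batch just for the repro, not the real data pipeline.
    x = torch.randn(4, 3, 224, 224)
    if torch.cuda.is_available():
        x = x.cuda()

    model(x).sum().backward()

    print(f"Conv2d layers hooked: {num_convs}")
    print(f"backward hook calls:  {len(grad_outputs)}")

If the two counts disagree, some backward hooks are being skipped; comparing the counts with and without DataParallel is a quick way to narrow down where the missing items come from.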