import torch
import torch.nn as nn

def save_gradient(module, grad_input, grad_output):
    print(f"{module.__class__.__name__} input grad:\n{grad_input}")
    print(f"{module.__class__.__name__} output grad:\n{grad_output}")

# LPPool2d with norm_type=3, kernel_size=2, stride=2
m = nn.LPPool2d(3, 2, stride=2)
m.register_backward_hook(save_gradient)

# 1x1x7x7 input holding the values 1..49, then transposed in the spatial dims
input = torch.tensor([float(i) for i in range(1, 50)], requires_grad=True).reshape(1, 1, 7, 7)
input = torch.transpose(input, 2, 3)
output = m(input)
output.backward(gradient=output)
I got:
LPPool2d input grad:
(tensor([[[[0.0309, 0.0107, 0.0063],
[0.0248, 0.0097, 0.0059],
[0.0206, 0.0089, 0.0056]]]]),)
LPPool2d output grad:
(tensor([[[[10.7722, 31.1708, 52.9788],
[13.4294, 34.2547, 56.1203],
[16.2184, 37.3533, 59.2653]]]], grad_fn=<PowBackward0>),)
If I'm not misunderstanding the backward hook's parameters, the LPPool2d input grad here should have size (1, 1, 7, 7) and the output grad size (1, 1, 3, 3), yet both are reported as (1, 1, 3, 3).
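For comparison, here is a minimal sketch (based on the docs, not on anything beyond the snippet above) using register_full_backward_hook, which the documentation recommends instead of the deprecated register_backward_hook; I would expect it to report an input grad with the full (1, 1, 7, 7) shape:

import torch
import torch.nn as nn

def save_full_gradient(module, grad_input, grad_output):
    # grad_input / grad_output are tuples; only the shapes matter here
    print(f"{module.__class__.__name__} input grad shape: {grad_input[0].shape}")
    print(f"{module.__class__.__name__} output grad shape: {grad_output[0].shape}")

m2 = nn.LPPool2d(3, 2, stride=2)
m2.register_full_backward_hook(save_full_gradient)

x = torch.arange(1., 50., requires_grad=True).reshape(1, 1, 7, 7).transpose(2, 3)
out = m2(x)
out.backward(gradient=out)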
My torch version is 1.13.0+cu116. Actually, I don't understand the LPPool mechanism very well. I just noticed that for AvgPool or MaxPool the input and output gradient sizes are different, and I want to know the reason why. I tried searching for this question but couldn't find the answer I was looking for.
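For reference on the mechanism itself: according to the LPPool2d docs, each output cell is (sum over the window of x^p)^(1/p), so with norm_type=3 and a 2x2 window it should be the cube root of the sum of the cubes in that window. A small check I sketched against avg_pool2d (the names here are just mine, for illustration):

import torch
import torch.nn.functional as F

x = torch.arange(1., 50.).reshape(1, 1, 7, 7).transpose(2, 3)

# nn.LPPool2d(3, 2, stride=2): p = 3, 2x2 windows, stride 2
lp = F.lp_pool2d(x, norm_type=3, kernel_size=2, stride=2)

# Manual version: sum of x**3 over each 2x2 window (avg * 4), then the cube root
manual = (F.avg_pool2d(x ** 3, kernel_size=2, stride=2) * 4) ** (1.0 / 3)

print(torch.allclose(lp, manual))  # I expect True for this all-positive input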
Thanks!