Can anyone please explain, with a simple example, what the exact meaning of grad_input and grad_output is? I tried to understand it from the documentation but I couldn't grasp it.
grad_output is the gradient coming from the output of the module during the backward pass, while grad_input is the gradient which will be passed to the corresponding input of the module during the backward pass.
Here is a small example using nn.ReLU():
import torch
import torch.nn as nn

m = nn.ReLU()
m.register_full_backward_hook(lambda module, grad_input, grad_output: print("grad_input: {}\ngrad_output: {}".format(grad_input, grad_output)))
x = torch.randn(2, 2, requires_grad=True)
print(x)
# tensor([[-0.5412, -0.2550],
# [-1.5957, 0.2068]], requires_grad=True)
out = m(x)
print(out)
# tensor([[0.0000, 0.0000],
# [0.0000, 0.2068]], grad_fn=<BackwardHookFunctionBackward>)
out.backward(gradient=torch.ones_like(out)*2.)
# grad_input: (tensor([[0., 0.],
# [0., 2.]]),)
# grad_output: (tensor([[2., 2.],
# [2., 2.]]),)
print(x.grad)
# tensor([[0., 0.],
# [0., 2.]])
As you can see, grad_output corresponds to the gradient I'm passing to backward, while grad_input is what the hook returns and what ends up in x.grad, since x is the input of the module.
Thanks a lot.
This makes sense to me now.
What if I need these grad_output values from the hooks in the main training loop? How should I get them after the backward call?
You can use backward hooks to access these gradients, as seen in my code snippet. If you need them after backward() returns, store them in a container from inside the hook.
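Here is one possible sketch of that idea (the grads dict and save_grads hook are just illustrative names, not a PyTorch API): the hook writes the gradients into a dict during backward(), so they remain accessible afterwards in the training loop.

```python
import torch
import torch.nn as nn

# Container populated by the hook; readable in the training loop after backward().
grads = {}

def save_grads(module, grad_input, grad_output):
    # Detach so we don't keep the autograd graph alive longer than needed.
    grads["grad_input"] = [g.detach() if g is not None else None for g in grad_input]
    grads["grad_output"] = [g.detach() if g is not None else None for g in grad_output]

m = nn.ReLU()
handle = m.register_full_backward_hook(save_grads)

x = torch.randn(2, 2, requires_grad=True)
out = m(x)
out.backward(gradient=torch.ones_like(out) * 2.)

# The gradients captured during the backward pass are now available here.
print(grads["grad_output"][0])  # the incoming gradient (all 2s)
print(grads["grad_input"][0])   # matches x.grad
handle.remove()  # remove the hook when it's no longer needed
```

Detaching inside the hook is a deliberate choice: if you kept references to the raw gradient tensors, the graph they belong to could be retained across iterations and grow memory usage.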