I am trying to use PyTorch to inspect the gradient values at each layer of a simple model. I am doing this with a backward hook on each layer. My hook currently prints the same value for the input and output gradients, so I am clearly misunderstanding something. Why are the input and output gradient values the same in my hook? My hook function is as follows:
def grad_hook(mod, inp, out):
    # inp and out are the grad_input and grad_output tuples for this module
    print("")
    print(mod)
    print("-" * 10 + ' Gradient Values ' + '-' * 10)
    print("")
    print('Incoming Grad value: {}'.format(inp[0].data))
    print("")
    print('Upstream Grad value: {}'.format(out[0].data))
    print("-" * 38)
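For reference, here is a minimal sketch of how I register the hook (simplified; the model below is just a stand-in for my actual small model, and I call register_backward_hook on each layer):

import torch
import torch.nn as nn

# Stand-in for my actual model: a small stack of layers.
model = nn.Sequential(
    nn.Linear(3, 5),
    nn.ReLU(),
    nn.Linear(5, 1),
)

# Register the hook on every layer so it fires during the backward pass.
for layer in model:
    layer.register_backward_hook(grad_hook)

x = torch.randn(1, 3)
loss = model(x).sum()
loss.backward()  # grad_hook is called once per layer here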
Here is example output for a linear layer:
Linear(in_features=3, out_features=5, bias=True)
---------- Gradient Values ----------
Incoming Grad value: tensor([-1.3997, -2.1604, 0.8113, -1.0236, 0.3797])
Upstream Grad value: tensor([[-1.3997, -2.1604, 0.8113, -1.0236, 0.3797]])
--------------------------------------