I'm having a very hard time understanding what tensor.backward(…) does in mathematical terms.
Assume we have a pre-trained model and do a forward pass:
model.zero_grad()
y = model(x)
Afterwards, we do a backward step on the output using the ground-truth target:
y.backward(gradient=target)
What exactly is happening here in mathematical terms? What is the gradient argument supposed to be, and why is the ground-truth target often used here?
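To make the question concrete, here is a minimal standalone sketch with a toy linear map standing in for my real model (the values of W, x, and v are made up for illustration):

```python
import torch

# Toy stand-in for a model: y = W @ x with a fixed weight matrix.
W = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
x = torch.tensor([5.0, 6.0], requires_grad=True)
y = W @ x  # forward pass, y is non-scalar

# Backward with an explicit `gradient` vector v: for a non-scalar y,
# backward computes the vector-Jacobian product and accumulates
# J^T @ v = W^T @ v into x.grad.
v = torch.tensor([1.0, 0.5])
y.backward(gradient=v)

print(x.grad)  # W^T @ v = [1*1 + 3*0.5, 2*1 + 4*0.5] = [2.5, 4.0]
```

My question is whether the "gradient = target" pattern is meant to play the role of this v vector, and if so, what that means mathematically.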
After doing the backward step, what result will I have in x.grad (mathematically)?
Is x.grad different from getting the gradient via register_backward_hook on the first layer? If yes, what result (mathematically) did I “hook” instead of x.grad?
def hook_layers(self):
    def hook_function(module, grad_in, grad_out):
        self.gradients.append(grad_in[0])

    # Register hook to the first layer
    self.model.conv1.register_backward_hook(hook_function)
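For context, here is how I have been comparing the two quantities in a minimal standalone setup (a single hypothetical conv1 layer standing in for the first layer of my real model):

```python
import torch
import torch.nn as nn

gradients = []

def hook_function(module, grad_in, grad_out):
    gradients.append(grad_in[0])

# Single conv layer standing in for the first layer of the model
conv1 = nn.Conv2d(1, 1, kernel_size=3)
conv1.register_backward_hook(hook_function)

x = torch.randn(1, 1, 5, 5, requires_grad=True)
y = conv1(x)
y.backward(gradient=torch.ones_like(y))

g = gradients[0]
print(x.grad.shape)                     # gradient w.r.t. the input tensor x
print(None if g is None else g.shape)   # what the hook captured as grad_in[0]
```

I am unsure whether g and x.grad are supposed to be the same tensor here, which is exactly what I am asking about.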
I hope this makes sense. If you need clarification, please feel free to ask!