I'm having a very hard time understanding what `tensor.backward(...)` does in mathematical terms.

Assume we have a pre-trained model and do a forward pass:

`model.zero_grad()`

`y = model(x)`

Afterwards, we do a backward step on the output using the ground-truth target:

`y.backward(gradient=target)`

What exactly is happening here in mathematical terms? What is the `gradient` argument supposed to be, and why is the ground-truth target often used here?
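For concreteness, here is a minimal, self-contained sketch of the setup I mean (the model, shapes, and names below are toy placeholders, not my actual network):

```
import torch
import torch.nn as nn

# Toy stand-in for the pre-trained model (placeholder, not my real network)
model = nn.Linear(4, 3)

x = torch.randn(2, 4, requires_grad=True)   # input, tracked so x.grad gets populated
target = torch.randn(2, 3)                  # "ground-truth" target, same shape as the output y

model.zero_grad()
y = model(x)                                # forward pass
y.backward(gradient=target)                 # the call I don't understand mathematically

print(x.grad)                               # what is this, mathematically?
```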

After doing the backward step, what result will I have in `x.grad` (mathematically)?

Is `x.grad` different from getting the gradient via `register_backward_hook` on the first layer? If yes, what result (mathematically) did I “hook” instead of `x.grad`?

```
def hook_layers(self):
    def hook_function(module, grad_in, grad_out):
        # Store the gradient flowing into the module
        self.gradients.append(grad_in[0])

    # Register the hook on the first layer
    self.model.conv1.register_backward_hook(hook_function)
```
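
For reference, this is roughly how I intended to compare the hooked gradient with `x.grad` (again just a sketch; the model, layer names, and shapes are placeholders mirroring my real setup):

```
import torch
import torch.nn as nn
from collections import OrderedDict

# Placeholder model whose first layer is called conv1, like in my real code
model = nn.Sequential(OrderedDict([
    ("conv1", nn.Conv2d(3, 4, kernel_size=3, padding=1)),
    ("relu",  nn.ReLU()),
    ("flat",  nn.Flatten()),
    ("fc",    nn.Linear(4 * 8 * 8, 10)),
]))

gradients = []

def hook_function(module, grad_in, grad_out):
    # grad_in / grad_out are tuples of gradients w.r.t. the module's inputs / outputs
    gradients.append(grad_in[0])

model.conv1.register_backward_hook(hook_function)

x = torch.randn(1, 3, 8, 8, requires_grad=True)
target = torch.randn(1, 10)   # same shape as the model output

model.zero_grad()
y = model(x)
y.backward(gradient=target)

# Is the hooked gradient the same thing (mathematically) as x.grad?
print(x.grad.shape)
print(gradients[0].shape if gradients[0] is not None else None)
```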

I hope this makes sense. If you need clarifications, please feel free to ask!