Hi,
If you consider a function f that takes n_input inputs and produces n_output outputs, its Jacobian matrix J_f, containing all its partial derivatives, has size (n_output x n_input).
Then what backpropagation (or AD, whichever way you want to name it) computes is the vector-Jacobian product v^T J_f for a given vector v.
If your function has a single output, it makes sense to take v = 1, so that backprop returns J_f directly.
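In PyTorch terms, this is why calling `.backward()` with no argument only works on a scalar. A minimal sketch (the function and values here are just a toy example):

```python
import torch

# Single scalar output: backward() uses v = 1 implicitly,
# so x.grad is the full Jacobian (here, just the gradient of y).
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()          # scalar output: y = x0^2 + x1^2 + x2^2
y.backward()                # equivalent to y.backward(torch.tensor(1.0))
print(x.grad)               # tensor([2., 4., 6.]), i.e. 2x
```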
But if you have multiple outputs, there is no good default, and so we require the user to provide the v value they want.
If you want to reconstruct the full J_f, you will have to do as many backward passes as there are outputs in your function. You can use the autograd.functional.jacobian function if you need that.
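To make that concrete, here is a sketch with a toy 2-input, 2-output function (the function f below is just an illustration): passing v picks out one row of J_f, and `torch.autograd.functional.jacobian` builds the whole matrix for you.

```python
import torch

def f(x):
    # toy function: 2 inputs, 2 outputs
    return torch.stack([x[0] * x[1], x[0] + x[1]])

x = torch.tensor([2.0, 3.0], requires_grad=True)
y = f(x)

# Multiple outputs: v must be given explicitly; backward computes v^T J_f.
v = torch.tensor([1.0, 0.0])   # selects the first row of J_f
y.backward(v)
print(x.grad)                  # tensor([3., 2.]) = d(x0*x1)/dx

# Or reconstruct the full (2 x 2) Jacobian in one call.
J = torch.autograd.functional.jacobian(f, torch.tensor([2.0, 3.0]))
print(J)                       # tensor([[3., 2.], [1., 1.]])
```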