I have a network with input X of size m x 2 and output Y of size m x 1. How can I generate the elementwise derivative of the output with respect to the input, i.e. a matrix dX of size m x 2 with
dX_i,1 = dY_i / dX_i,1 and
dX_i,2 = dY_i / dX_i,2?
There is torch.autograd.functional.jacobian, but note its caveats.
The problem with using jacobian is that it produces matrices that are far too big, and I run into memory problems. For example, if the input is 10x2 and the output is 10x1, the Jacobian is 10x20, and I only need the diagonal elements of the two 10x10 matrices stacked next to each other.
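For small m you can see those shapes directly by materializing the full Jacobian and pulling out the wanted diagonal entries. A sketch with a toy network (the layer sizes and architecture are placeholders, not from the thread):

```python
import torch

m = 10
net = torch.nn.Sequential(
    torch.nn.Linear(2, 8), torch.nn.Tanh(), torch.nn.Linear(8, 1)
)
X = torch.randn(m, 2)

# Full Jacobian of the (m, 1) output w.r.t. the (m, 2) input.
# Its shape is (m, 1, m, 2), i.e. 10x20 when flattened.
J = torch.autograd.functional.jacobian(net, X)

# Keep only the "diagonal" entries dY_i / dX_i,k, i.e. output row i
# differentiated w.r.t. input row i.
idx = torch.arange(m)
dX = J[idx, 0, idx, :]  # shape (m, 2)
```

This still builds the full m x 1 x m x 2 Jacobian in memory before discarding everything off the diagonal, which is exactly the problem for large m.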
Then do a for loop over the 2 input columns.
I can’t generate the Jacobian, because it is too big. Even if I could, that would be very slow and inefficient.
Yeah, so I don’t think there is any way to directly generate just the diagonal unless you know special properties of the function (like "only the ith input row influences the ith output", which - except for batch norm - is what we have in minibatches).
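When that batch-independence does hold (each Y_i depends only on row i of X), a single backward pass recovers exactly the diagonal entries, because the off-diagonal blocks of the Jacobian are zero. A sketch under that assumption (the network is again a placeholder):

```python
import torch

m = 10
net = torch.nn.Sequential(
    torch.nn.Linear(2, 8), torch.nn.Tanh(), torch.nn.Linear(8, 1)
)

X = torch.randn(m, 2, requires_grad=True)
Y = net(X)  # shape (m, 1)

# d(sum_j Y_j)/dX_i,k = sum_j dY_j/dX_i,k. If only Y_i depends on row i,
# the j == i term is the only nonzero one, so this IS dY_i/dX_i,k.
dX, = torch.autograd.grad(Y.sum(), X)  # shape (m, 2)
```

This needs only O(m) memory for the gradient instead of O(m^2) for the full Jacobian, but it silently gives you row sums of Jacobian entries if outputs do mix across the batch (e.g. with batch norm), so check that assumption first.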