Hello Ömer!
Just as you use pytorch’s autograd to calculate the derivatives (gradient)
of your loss function with respect to your model’s parameters (and then
use those to update your model with gradient descent), you can use
autograd to calculate the derivatives of a prediction for a single class
with respect to the input to your model. This will be a single column of
your Jacobian matrix.
You can then loop over predictions / columns to build the full Jacobian.
Models work on batches, even if you want to process a single image.
So, for a single image, you need a batch with batch size of one.
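For instance, a minimal sketch of adding that batch dimension (the tensor names here are hypothetical):

```python
import torch

# a single 28x28 image flattened to 784 features, with no batch dimension
img = torch.randn (784)
batch = img.unsqueeze (0)   # add a leading batch dimension of size one
print (batch.shape)         # torch.Size([1, 784])
```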
Let’s say you have an input tensor with shape [1, 784]. (It could be
[1, 28, 28], if that is what your model expects.) You say your model
has ten classes, so you will have:
preds = model (input)
where preds has shape [nBatch, nClass] = [1, 10].
Tell pytorch to track gradients with respect to your input, apply your
model to input, and call .backward() on your preds[i], looping over
i. The .grad property of your input tensor will then be the i-th column
of the Jacobian.
J = torch.zeros ((1, 784, 10))   # loop will fill in Jacobian
input.requires_grad = True
preds = model (input)
for i in range (10):
    grd = torch.zeros ((1, 10))   # same shape as preds
    grd[0, i] = 1                 # column of Jacobian to compute
    preds.backward (gradient = grd, retain_graph = True)
    J[:, :, i] = input.grad       # fill in one column of Jacobian
    input.grad.zero_()            # .backward() accumulates gradients, so reset to zero
You could also try pytorch’s experimental torch.autograd.functional.jacobian()
function, which I think basically wraps the loop I outlined above, but with
more bells and whistles (but I’ve never used it).
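To make the connection concrete, here is a self-contained sketch that runs the backward() loop on a hypothetical toy model (a single Linear layer standing in for your ten-class model) and checks it against torch.autograd.functional.jacobian():

```python
import torch

# hypothetical toy model with 784 inputs and ten output classes
model = torch.nn.Linear (784, 10)

input = torch.randn (1, 784)
input.requires_grad = True

# Jacobian via the backward() loop described above
J = torch.zeros ((1, 784, 10))
preds = model (input)
for i in range (10):
    grd = torch.zeros ((1, 10))   # same shape as preds
    grd[0, i] = 1                 # column of Jacobian to compute
    preds.backward (gradient = grd, retain_graph = True)
    J[:, :, i] = input.grad       # fill in one column of Jacobian
    input.grad.zero_()            # reset accumulated gradients

# same Jacobian via torch.autograd.functional.jacobian()
J2 = torch.autograd.functional.jacobian (model, input)   # shape [1, 10, 1, 784]
J2 = J2.squeeze (2).permute (0, 2, 1)                    # rearrange to [1, 784, 10]

print (torch.allclose (J, J2))   # True
```

Note that jacobian() returns a tensor indexed as output-shape-then-input-shape, so it has to be rearranged before comparing with the loop’s [nBatch, nFeatures, nClass] layout.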
Good luck.
K. Frank