Derivative of model outputs w.r.t input features

Hi all,

Assume that we have a pretrained NN model like LeNet-5 which successfully predicts handwritten digits. In this case the number of input features is 784 (assuming 28x28 input images) and the number of outputs is 10. The output values (probabilities) sum to 1, and each output gives the probability of that class for the given input image.

Now assume that I already have this model and can predict the class of any input image with a forward pass.

My question is: for any test input image, I would like to calculate the derivative of every model output w.r.t. every model input feature. This amounts to calculating a 784x10 Jacobian matrix J.

For example, J[0,0] is the derivative of output 0 w.r.t. input feature 0, J[2,3] is the derivative of output 3 w.r.t. input feature 2, and so on…
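In other words, J[i, j] = ∂(output j) / ∂(input feature i), with i running over the 784 input pixels and j over the 10 classes.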

I could probably calculate each element of this Jacobian with some messy code, but I wonder whether there is an easy and elegant way of doing so.

Any comment on how to calculate the J matrix?

Hello Ömer!

Just as you use pytorch’s autograd to calculate the derivatives (gradient)
of your loss function with respect to your model’s parameters (and then
use those to update your model with gradient descent), you can use
autograd to calculate the derivatives of a prediction for a single class
with respect to the input to your model. This will be a single column of
your Jacobian matrix.

You can then loop over predictions / columns to build the full Jacobian.

Models work on batches, even if you want to process a single image.
So, for a single image, you need a batch with batch size of one.
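For example, a minimal sketch with a dummy image (adjust the shapes to whatever your model actually expects):

import torch

image = torch.rand(784)        # a single flattened 28x28 image (dummy data)
input = image.unsqueeze(0)     # add the batch dimension -> shape [1, 784]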

Let’s say you have an input tensor, with shape [1, 784]. (It could be
[1, 28, 28], if that is what your model expects.) You say your model
has ten classes, so you will have:

preds = model(input)

where preds has shape [nBatch, nClass] = [1, 10].

Tell pytorch to track gradients with respect to your input, apply your
model to input, and call .backward() on your preds[0, i], looping over
i. The .grad property of your input tensor will be the ith column
of the Jacobian.

J = torch.zeros((1, 784, 10))   # loop will fill in Jacobian
input.requires_grad = True
preds = model(input)
for i in range(10):
    grd = torch.zeros((1, 10))   # same shape as preds
    grd[0, i] = 1   # column of Jacobian to compute
    preds.backward(gradient=grd, retain_graph=True)
    J[:, :, i] = input.grad   # fill in one column of Jacobian
    input.grad.zero_()   # .backward() accumulates gradients, so reset to zero

You could also try pytorch’s experimental jacobian() function, which
I think basically wraps the loop I outlined above, but with more bells
and whistles (though I’ve never used it myself).
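If you want to try it, a minimal sketch might look like this (I’m assuming torch.autograd.functional.jacobian() and the same model and [1, 784] input as above):

from torch.autograd.functional import jacobian

# jacobian() runs the forward pass itself, so no requires_grad bookkeeping is needed
J_full = jacobian(model, input)    # shape [1, 10, 1, 784] (output shape followed by input shape)
J_alt = J_full.squeeze().t()       # rearrange to [784, 10] to match the loop above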

Good luck.

K. Frank


Hi Frank,

I just made a very minor change and it seems to work. Thanks for your help! Appreciated…

my_input = test_data[num][0].view(1, 1, 28, 28)

J = torch.zeros((784, 10))   # loop will fill in Jacobian
J = J.float()

my_input.requires_grad_()

preds = model(my_input)
print("preds shape is: ", preds.shape)

for i in range(10):
    grd = torch.zeros((1, 10))   # same shape as preds
    grd[0, i] = 1   # column of Jacobian to compute
    preds.backward(gradient=grd, retain_graph=True)
    J[:, i] = my_input.grad.view(784).float()   # fill in one column of Jacobian
    my_input.grad.zero_()   # .backward() accumulates gradients, so reset to zero

print(J.shape)
print(J)

Thanks! Really helpful.