How to obtain the derivative of the network output with respect to the input?

Inz-Zos · November 25, 2022, 7:40pm

I want to get the derivative of the network output with respect to the input to see the state of the prediction model. My initial idea was to use the hook function to get the gradient of each layer and then multiply it to get the result. But I don’t know exactly how to do it.

srishti-git1110 · November 25, 2022, 8:03pm

Hi,
You could simply use torch.autograd functionality. There could be a bunch of ways to do the same thing.

I recommend reading the official documentation to get a handle on the autograd engine. Feel free to post any queries here.

Inz-Zos · November 26, 2022, 8:27am

Thanks, but when I used torch.autograd.grad, it only seems to support derivatives of scalar outputs. I tried weighting the output tensor, but that doesn’t seem to work. Is there another way?

for step in range(200):
    pred_y=model(input_x)
    dy_dx=torch.autograd.grad(pred_y,input_x)
    print('dy/dx: ',dy_dx)
    loss=loss_func(pred_y,input_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Traceback (most recent call last):
  File "C:\Users\Slive\PycharmProjects\pythonProject\main.py", line 41, in <module>
    dy_dx=torch.autograd.grad(pred_y,input_x)
  File "C:\Users\Slive\anaconda3\envs\Pt\lib\site-packages\torch\autograd\__init__.py", line 150, in grad
    grad_outputs = _make_grads(outputs, grad_outputs)
  File "C:\Users\Slive\anaconda3\envs\Pt\lib\site-packages\torch\autograd\__init__.py", line 34, in _make_grads
    raise RuntimeError("grad can be implicitly created only for scalar outputs")
RuntimeError: grad can be implicitly created only for scalar outputs

srishti-git1110 · November 26, 2022, 9:17am

Hi, yes, that behaviour is expected when the tensor you differentiate (the one you call backward on, or pass to torch.autograd.grad) isn’t a scalar, i.e. a tensor containing a single element. You could apply a reduction to it first, like -

loss.sum().backward()

Or,
As is specified in the docs, an additional gradient argument needs to be specified in the backward call on a multidimensional tensor.

gradient is a tensor of matching type and location, and contains the gradient of the differentiated function w.r.t. itself. Mathematically, it’s dLoss/dLoss. So, the following should work -

loss.backward(torch.ones_like(loss))
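The same two options carry over to the torch.autograd.grad call from the question. Below is a minimal sketch, assuming a small toy network and random data as hypothetical stand-ins for the poster's model and input_x. It relies on each output row depending only on its own input row (no batch-coupling layers such as batch norm), so row i of the result is the derivative of sample i's output with respect to sample i's input.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the model and data in the question
model = nn.Sequential(nn.Linear(3, 8), nn.Tanh(), nn.Linear(8, 1))
input_x = torch.randn(5, 3, requires_grad=True)  # the input must track gradients

pred_y = model(input_x)  # shape (5, 1): not a scalar

# Option 1: reduce the output to a scalar before differentiating.
# retain_graph=True keeps the graph alive for further grad/backward calls.
(dy_dx_sum,) = torch.autograd.grad(pred_y.sum(), input_x, retain_graph=True)

# Option 2: keep the output as is and supply grad_outputs explicitly.
(dy_dx_ones,) = torch.autograd.grad(
    pred_y,
    input_x,
    grad_outputs=torch.ones_like(pred_y),
    retain_graph=True,
)

print(dy_dx_sum.shape)  # same shape as input_x
```

Both options compute the same quantity, since passing a tensor of ones as grad_outputs is equivalent to summing the output first. Keeping retain_graph=True matters in a training loop like the one above, because loss.backward() is called on the same graph afterwards.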

AlphaBetaGamma96 · November 26, 2022, 11:13am

Hi @Inz-Zos,

If you want an explicit function for your derivative, you can use the functorch library (which comes packaged with the latest PyTorch install). More info can be found in the docs: Per-sample-gradients — functorch 1.13 documentation

An example,

from functorch import make_functional, vmap, grad, jacrev

model = Model(*args, **kwargs) #model instance

fnet, params = make_functional(model) #functorch needs a functionalized model

#if your model has multiple outputs use jacrev instead of grad, i.e. jacrev(fnet, argnums=1)
grad_fnet = grad(fnet, argnums=1) #creates explicit function of gradient of y w.r.t x

per_sample_jacobian = vmap(grad_fnet, in_dims=(None, 0))(params, x) #per-sample gradients
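
For reference, here is a self-contained sketch of the same idea with a multi-output model, so it uses jacrev rather than grad. It assumes PyTorch 2.0 or later, where these transforms are also available under torch.func and functional_call takes the place of make_functional. The toy model, the shapes, and the wrapper name fnet are illustrative assumptions, not part of the original example.

```python
import torch
import torch.nn as nn
from torch.func import functional_call, jacrev, vmap

torch.manual_seed(0)

# Hypothetical toy model with 3 inputs and 2 outputs
model = nn.Sequential(nn.Linear(3, 8), nn.Tanh(), nn.Linear(8, 2))
x = torch.randn(5, 3)  # batch of 5 samples

# Detached copies of the parameters, passed explicitly to the functional call
params = {name: p.detach() for name, p in model.named_parameters()}

def fnet(params, x_single):
    # Evaluate the model on one sample: add a batch dimension, then drop it again
    out = functional_call(model, params, (x_single.unsqueeze(0),))
    return out.squeeze(0)

# Jacobian of the output w.r.t. the input (argnums=1), mapped over the batch
jac_fn = jacrev(fnet, argnums=1)
per_sample_jacobian = vmap(jac_fn, in_dims=(None, 0))(params, x)

print(per_sample_jacobian.shape)  # (batch, outputs, inputs)
```

Entry [i, j, k] of the result is the derivative of output j with respect to input k for sample i.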

Inz-Zos · November 26, 2022, 12:50pm

Thank you so much! Just got through the modifications and can already come up with the results.

Inz-Zos · November 26, 2022, 12:53pm

Thank you, I will continue to learn more about your method.