I want to get the derivative of the network output with respect to the input to see the state of the prediction model. My initial idea was to use hook functions to get the gradient of each layer and then multiply them together to get the result, but I don't know exactly how to do it.
You could simply use torch.autograd functionality. There could be a bunch of ways to do the same thing.
I recommend reading the official documentation to get a handle on the autograd engine. Feel free to post any queries here.
Thanks, but after I used the autograd function, it seems to only support derivatives of scalars. I tried weighting the output tensor, but it doesn't seem to work. I would like to ask if there is another way.
for step in range(200):
    pred_y=model(input_x)
    dy_dx=torch.autograd.grad(pred_y,input_x)
    print('dy/dx: ',dy_dx)
    loss=loss_func(pred_y,input_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
Traceback (most recent call last):
  File "C:\Users\Slive\PycharmProjects\pythonProject\main.py", line 41, in <module>
    dy_dx=torch.autograd.grad(pred_y,input_x)
  File "C:\Users\Slive\anaconda3\envs\Pt\lib\site-packages\torch\autograd\__init__.py", line 150, in grad
    grad_outputs = _make_grads(outputs, grad_outputs)
  File "C:\Users\Slive\anaconda3\envs\Pt\lib\site-packages\torch\autograd\__init__.py", line 34, in _make_grads
    raise RuntimeError("grad can be implicitly created only for scalar outputs")
RuntimeError: grad can be implicitly created only for scalar outputs
Hi, yes, that behaviour is expected when the loss tensor (the one you call backward on) isn't a scalar (a tensor containing a single element). You could apply some reduction to it so that it becomes a scalar.
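The reduction idea can be sketched roughly as follows. This is a minimal illustration, not the snippet from the original reply: the small model and the random input_x are hypothetical stand-ins, and sum() is just one possible reduction. It also assumes each output row depends only on its own input row (no batch norm or similar), in which case the gradient of the sum gives the per-sample dy/dx.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the thread's `model` and `input_x`.
model = nn.Sequential(nn.Linear(3, 8), nn.Tanh(), nn.Linear(8, 1))
input_x = torch.randn(5, 3, requires_grad=True)  # the input must track gradients

pred_y = model(input_x)      # shape (5, 1): not a scalar
scalar_out = pred_y.sum()    # reduce to a single element

# autograd.grad returns a tuple with one entry per input tensor.
(dy_dx,) = torch.autograd.grad(scalar_out, input_x)
print(dy_dx.shape)  # torch.Size([5, 3])
```

Note that input_x has to be created with requires_grad=True (or have requires_grad_() called on it), otherwise autograd has nothing to differentiate with respect to.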
As specified in the docs, an additional gradient argument needs to be passed to the backward call on a multidimensional tensor. gradient is a tensor of matching type and location that contains the gradient of the differentiated function w.r.t. itself. Mathematically, it's dLoss/dLoss, so passing a tensor of ones with the same shape as the output should work.
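Applied to the training loop from the question, this might look like the sketch below. It is a reconstruction under assumptions rather than the code from the original reply: the model, loss, optimizer, and data are hypothetical placeholders, and it uses torch.autograd.grad with grad_outputs, which plays the same role as the gradient argument of backward. retain_graph=True is included because loss.backward() later needs the same graph.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the names used in the question.
model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
loss_func = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
input_x = torch.linspace(-1, 1, 20).unsqueeze(1).requires_grad_()
input_y = input_x.detach() ** 2

for step in range(3):
    pred_y = model(input_x)
    # A tensor of ones shaped like pred_y stands in for dLoss/dLoss.
    # retain_graph=True keeps the graph alive for loss.backward() below.
    (dy_dx,) = torch.autograd.grad(
        pred_y,
        input_x,
        grad_outputs=torch.ones_like(pred_y),
        retain_graph=True,
    )
    loss = loss_func(pred_y, input_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

With an all-ones grad_outputs this computes the same thing as differentiating pred_y.sum(), so it yields per-sample derivatives only when each output row depends on its own input row alone.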
If you want an explicit function for your derivative, you can use the functorch library (which comes packaged with the latest PyTorch install). More info can be found in the docs: Per-sample-gradients — functorch 1.13 documentation
from functorch import make_functional, vmap, grad, jacrev

model = Model(*args, **kwargs)  # model instance
fnet, params = make_functional(model)  # functorch needs a functionalized model

# if your model has multiple outputs, use jacrev instead of grad, i.e. jacrev(fnet, argnums=1)
grad_fnet = grad(fnet, argnums=1)  # creates an explicit function for the gradient of y w.r.t. x
per_sample_jacobian = vmap(grad_fnet, in_dims=(None, 0))(params, x)  # per-sample gradients
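In PyTorch 2.0 and later, the same transforms are available under torch.func, where functional_call takes the place of make_functional. The following is a hedged sketch of the equivalent per-sample Jacobian under that API; the small two-output model and the random x are hypothetical, chosen only so that the shapes are easy to follow.

```python
import torch
import torch.nn as nn
from torch.func import functional_call, jacrev, vmap

torch.manual_seed(0)

# Hypothetical model with 3 inputs and 2 outputs.
model = nn.Sequential(nn.Linear(3, 8), nn.Tanh(), nn.Linear(8, 2))
params = {name: p.detach() for name, p in model.named_parameters()}
x = torch.randn(5, 3)  # batch of 5 samples


def fnet(params, x_single):
    # Run the module on one sample (shape (3,)) with the given parameters.
    return functional_call(model, params, (x_single,))


# jacrev differentiates w.r.t. the sample; vmap maps over the batch dimension of x.
per_sample_jacobian = vmap(jacrev(fnet, argnums=1), in_dims=(None, 0))(params, x)
print(per_sample_jacobian.shape)  # torch.Size([5, 2, 3])
```

Each slice per_sample_jacobian[i] is the 2x3 matrix of output derivatives with respect to the inputs of sample i.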
Thank you so much! Just got through the modifications and can already come up with the results.
Thank you, I will continue to learn more about your method.