Say that I want to calculate the output of a function and its Jacobian. One way to do it would be the following:

import torch

x = torch.rand(5, requires_grad=True)
net = torch.nn.Linear(5, 2)
y = net(x)
jac = torch.autograd.functional.jacobian(net, x)
However, this means that I'll have to do two forward passes. It feels like there should be a way to do the following

jac = torch.autograd.functional.jacobian(lambda x: y, x)

but jac comes out all zeros. Is there a way to do this?
The end goal is to use the Jacobian with respect to the network’s activations in a regularization term, but this is the step I’m bothered about.
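A minimal illustration of what is going on: `jacobian()` calls the given function with its own fresh input tensor, so a lambda that ignores its argument returns a constant and the result is all zeros. Reusing the existing graph with `torch.autograd.grad` avoids the second forward pass (the dimensions below are just the ones from the question; for a linear layer the Jacobian equals the weight matrix, which makes the check easy):

```python
import torch

x = torch.rand(5, requires_grad=True)
net = torch.nn.Linear(5, 2)
y = net(x)

# The lambda ignores its argument, so its output is constant w.r.t. the
# input tensor that jacobian() passes in -> Jacobian of zeros.
zeros = torch.autograd.functional.jacobian(lambda t: y, x)
print(zeros.abs().max())

# Reusing the existing graph instead: one backward call per output
# component, each with a one-hot grad_outputs vector.
rows = [torch.autograd.grad(y, x, grad_outputs=torch.eye(2)[i],
                            retain_graph=True)[0]
        for i in range(2)]
jac = torch.stack(rows)
print(torch.allclose(jac, net.weight))  # True: dy/dx of a linear layer is W
```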
I already made a post about the Jacobian computation with PyTorch (as well as the gradient, divergence and Laplacian). Have a look at it; it might be useful.
Thanks, your solution works fine as long as one makes sure that the input and output are 1D. I got an order-of-magnitude speed-up in the Jacobian calculation with your approach.
def gradient(y, x, grad_outputs=None):
    """Compute grad_outputs @ dy/dx (a vector-Jacobian product)."""
    if grad_outputs is None:
        grad_outputs = torch.ones_like(y)
    grad = torch.autograd.grad(y, [x], grad_outputs=grad_outputs, create_graph=True)[0]
    return grad

def jacobian(y, x):
    """Compute dy/dx row by row, via grad_outputs @ dy/dx
    for grad_outputs in [1, 0, ..., 0], [0, 1, 0, ..., 0], ..., [0, ..., 0, 1]
    (assumes y and x are 1D)."""
    jac = torch.zeros(y.shape[0], x.shape[0])
    grad_outputs = torch.zeros_like(y)
    for i in range(y.shape[0]):
        grad_outputs[i] = 1
        jac[i] = gradient(y, x, grad_outputs=grad_outputs)
        grad_outputs[i] = 0
    return jac
x = torch.rand(3, requires_grad=True)
net = torch.nn.Linear(3, 2)
y = net(x)
jac1 = torch.autograd.functional.jacobian(net, x, create_graph=True)
jac2 = jacobian(y, x)
print(torch.allclose(jac1, jac2)) # True
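As a side note, on recent PyTorch versions (1.11+) the Python loop over one-hot grad_outputs can be folded into a single backward call with the `is_grads_batched` flag of `torch.autograd.grad`; a sketch, again reusing the already-computed y and checking against the layer's weight matrix:

```python
import torch

def jacobian_batched(y, x):
    """All rows of dy/dx in one autograd.grad call, by passing a batch of
    one-hot grad_outputs (requires is_grads_batched, PyTorch >= 1.11)."""
    basis = torch.eye(y.shape[0])  # one one-hot vector per output component
    return torch.autograd.grad(y, x, grad_outputs=basis,
                               is_grads_batched=True, create_graph=True)[0]

x = torch.rand(3, requires_grad=True)
net = torch.nn.Linear(3, 2)
y = net(x)
print(torch.allclose(jacobian_batched(y, x), net.weight))  # True
```

`create_graph=True` keeps the result differentiable, which matters for the stated end goal of putting the Jacobian into a regularization term.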