# Calculating the Jacobian matrix after calculating the gradient with autograd

Hi everyone,

I’ve been trying to calculate the Jacobian matrix, or a Jacobian-vector product, in the case where an explicit formula for the gradient is not available and I compute the gradient with autograd. Here, PyTorch’s `jacobian()` returns a zero matrix. In the matrix-vector-product version, it returns the same value as the gradient of the objective function with respect to the parameter theta. Please find a toy example below. I appreciate your guidance.

#First version
#Forming the Jacobian matrix
import torch

def objective(x, theta):
    # placeholder toy objective, standing in for my actual objective
    return (theta[0] * x + theta[1] * x ** 2).sum()

def gradient(x, theta):
    # gradient of the objective w.r.t. x, computed by autograd
    funval = objective(x, theta)
    funval.backward()
    return x.grad

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
theta = torch.tensor([-1.0, -3.0], requires_grad=True)

# Jacobian of the gradient w.r.t. theta (this comes back as all zeros)
jac = torch.autograd.functional.jacobian(lambda t: gradient(x, t), theta)
print(jac)

#Second version
#Forming the Jacobian matrix-vector product with, for example, the vector of ones
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
theta = torch.tensor([-1.0, -3.0], requires_grad=True)

funval = objective(x, theta)
# create_graph=True is assumed here so that x.grad can be differentiated again;
# some details of this version were omitted from my original post
funval.backward(create_graph=True)

# dot the gradient w.r.t. x with a vector of ones, then backpropagate again
temp = x.grad @ torch.ones(3)
temp.backward()

print(theta.grad)
#which is the same as the gradient of the objective function with respect to theta

The Jacobian of a vector-valued function of a vector argument is
the matrix of partial derivatives of each element of the function’s value
with respect to each element of the argument.
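
For concreteness, here is a minimal sketch of that definition in code, using a made-up vector-valued function and PyTorch’s `torch.autograd.functional.jacobian()`:

```python
import torch

def g(v):
    # made-up vector-valued function of a vector argument
    return torch.stack([v[0] * v[1], v[1] ** 2, v[0] + v[2]])

v = torch.tensor([1.0, 2.0, 3.0])
jac = torch.autograd.functional.jacobian(g, v)
print(jac)   # the 3 x 3 matrix of partials d g_i / d v_j
```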

In your case, your `objective()` function returns a scalar value, not a
vector, so we normally would not use the term Jacobian. (If you want
to say that your scalar result is a one-dimensional vector, you could
treat your gradient vector as a 1 x n matrix and call it the Jacobian.)

Could you clarify what you are asking here?

Note that the Hessian is the matrix of second-order mixed partial
derivatives of a scalar-valued function of a vector argument. Could
that be what you are asking about?
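
For reference, a minimal sketch of computing such a Hessian with PyTorch’s `torch.autograd.functional.hessian()`, using a made-up scalar-valued function:

```python
import torch

def f(v):
    # made-up scalar-valued function of a vector argument
    return (v ** 3).sum() + v[0] * v[1]

v = torch.tensor([1.0, 2.0, 3.0])
hess = torch.autograd.functional.hessian(f, v)
print(hess)   # the 3 x 3 matrix of second partials d^2 f / (d v_i d v_j)
```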

Best.

K. Frank

Thank you for your reply, K. Frank.

Let me clarify. Let f(x, theta) be our objective. First, I calculate its gradient with respect to x, treating theta as constant, in a function `gradient(x, theta)`, which is vector-valued. Then I want to calculate the derivative of `gradient(x, theta)` with respect to theta. The result will be the Jacobian matrix. Does that make it clearer?
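
In code, the computation I have in mind looks roughly like this (with a placeholder objective, and using `create_graph=True` so the gradient can itself be differentiated):

```python
import torch

def objective(x, theta):
    # placeholder objective, just for illustration
    return (theta[0] * x + theta[1] * x ** 2).sum()

def gradient(x, theta):
    # vector-valued gradient of the objective w.r.t. x, with the graph kept
    # so that it can be differentiated again w.r.t. theta
    funval = objective(x, theta)
    return torch.autograd.grad(funval, x, create_graph=True)[0]

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
theta = torch.tensor([-1.0, -3.0], requires_grad=True)

# derivative of gradient(x, theta) w.r.t. theta: a 3 x 2 Jacobian matrix
jac = torch.autograd.functional.jacobian(lambda t: gradient(x, t), theta)
print(jac)
```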

If you have a vector-valued function, `g()` (that happens to be the gradient
of some scalar-valued function, `f()`), then yes, the derivative of `g()` will
be `g()`'s Jacobian.

But that’s an odd way of describing it. You are, of course, computing the
second-order derivatives of `f()`, which is to say, you are computing the
Hessian of `f()`. Calling it the Jacobian (of `g()`) just obscures what is
going on.

Regardless of what you choose to call it, the computation you describe
is that of the mixed second-order partial derivatives of `f()`.
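
In symbols, if we write `g(x, theta)` for the gradient of `f()` with respect to `x`, the matrix you describe is

$$
J_{ij} \;=\; \frac{\partial g_i}{\partial \theta_j} \;=\; \frac{\partial^2 f}{\partial \theta_j \, \partial x_i},
$$

that is, the `x`-`theta` cross block of the Hessian of `f()`.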

I would suggest that you start with PyTorch’s `hessian()` functional
(`torch.autograd.functional.hessian()`) to compute this.

It is true that `hessian()` will compute the full Hessian of `f()`, whereas
in your description you only want the `x`-`theta` cross terms, so there could
be some inefficiency in that you would also compute the diagonal `x`-`x` and
`theta`-`theta` blocks that you don’t want.
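
As a rough sketch (with a made-up objective), `hessian()` will accept a tuple of inputs and return the Hessian as a tuple of tuples of blocks, from which you can pick out the `x`-`theta` block:

```python
import torch
from torch.autograd.functional import hessian

def f(x, theta):
    # made-up scalar objective, just for illustration
    return (theta[0] * x + theta[1] * x ** 2).sum()

x = torch.tensor([1.0, 2.0, 3.0])
theta = torch.tensor([-1.0, -3.0])

# hess is a 2 x 2 nested tuple of blocks:
# ((d2f/dx dx, d2f/dx dtheta), (d2f/dtheta dx, d2f/dtheta dtheta))
hess = hessian(f, (x, theta))
cross = hess[0][1]   # the x-theta cross block, shape (3, 2)
print(cross)
```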

But I would recommend `hessian()`, only moving on to something more
complicated if it proves inadequate for your needs.

Best.

K. Frank