Gradient of cholesky and cholesky_inverse

I have a question about the gradients of torch.cholesky and torch.cholesky_inverse. In the example below I solve a linear system Ax = b, where A is symmetric positive definite, and then use x to compute a loss. However, after calling loss.backward(), I find that A.grad is None. What is the correct way to compute the gradient of the loss with respect to A and b?

Is it because the backward pass is not implemented for these functions? More generally, how can I check whether a function in PyTorch has a backward (gradient) implementation?

import torch
A = torch.randn(3, 3, requires_grad=True)
A = torch.mm(A, A.t()) + 1e-05 * torch.eye(3) # make symmetric positive definite
b = torch.randn(3, 1, requires_grad=True)

u = torch.cholesky(A)
Ainv = torch.cholesky_inverse(u)

x = torch.matmul(Ainv, b)
loss = x.mean()
loss.backward()
print(A.grad)

In your code snippet you are overwriting the leaf tensor A with a new, non-leaf tensor that is also called A, so its .grad attribute will not be populated. You should also get a warning when you try to access it:

UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations.
  print(A.grad)
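
To see the difference directly, you can inspect the .is_leaf and .grad_fn attributes before and after the rebinding. A minimal sketch (the shapes are just for illustration):

import torch

A = torch.randn(3, 3, requires_grad=True)
print(A.is_leaf)   # True: A was created directly by the user
print(A.grad_fn)   # None: leaf tensors have no grad_fn

A = torch.mm(A, A.t()) + 1e-05 * torch.eye(3)  # rebinds the name A to a new, non-leaf tensor
print(A.is_leaf)   # False: this tensor is the result of autograd-tracked ops
print(A.grad_fn)   # e.g. <AddBackward0 ...>; backward() only populates .grad for leaf tensors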

You can avoid this by assigning the matmul result to a new tensor:

A = torch.randn(3, 3, requires_grad=True)
A1 = torch.mm(A, A.t()) + 1e-05 * torch.eye(3) # make symmetric positive definite
b = torch.randn(3, 1, requires_grad=True)

u = torch.cholesky(A1)
Ainv = torch.cholesky_inverse(u)

x = torch.matmul(Ainv, b)
loss = x.mean()
loss.backward()
print(A.grad)
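
Regarding the second question: if an operation's output has a non-None .grad_fn, autograd has recorded a backward function for it; an op without a backward implementation typically raises a RuntimeError during backward(). You can also verify the analytical gradients numerically with torch.autograd.gradcheck. A rough sketch, assuming your PyTorch version implements the backward of both ops (gradcheck expects double precision, and the regularization term is bumped to 1e-03 here purely for numerical stability of the check):

import torch

A = torch.randn(3, 3, dtype=torch.double, requires_grad=True)
A1 = torch.mm(A, A.t()) + 1e-03 * torch.eye(3, dtype=torch.double)

u = torch.cholesky(A1)
Ainv = torch.cholesky_inverse(u)
print(u.grad_fn)     # non-None, so cholesky has a registered backward
print(Ainv.grad_fn)  # non-None, so cholesky_inverse has a registered backward

# Numerical check of the analytical gradients (can be slow for large inputs)
def spd_inverse(mat):
    a1 = torch.mm(mat, mat.t()) + 1e-03 * torch.eye(3, dtype=torch.double)
    return torch.cholesky_inverse(torch.cholesky(a1))

print(torch.autograd.gradcheck(spd_inverse, (A,)))  # True if analytical and numerical gradients match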