Compute gradient of determinant function or inverse function

I have a loss function defined like this

import math
import numpy as np
import torch

def Loss(U, G_metric, p, q):
    ''' U is a function that takes a vector and returns a scalar.
        G_metric is a function that returns a matrix; it is a metric tensor.
        p, q are two vectors.
    '''
    D = p.size()[0]  # dimension of p and q
    G = G_metric(q)  # metric tensor evaluated at q, a DxD matrix

    detG = torch.Tensor([np.linalg.det(G.data.numpy())])  # its determinant
    invG = torch.Tensor(np.linalg.inv(G.data.numpy()))    # its inverse

    return U(q) + 0.5*torch.log((2*math.pi)**D*detG) + 0.5*p@invG@p  # compute the loss

Now I want the gradient w.r.t. p and q

Can I just directly do


L = Loss(U,G_metric,p,q)
L.backward()
q.grad
p.grad

to get the gradient vectors? I use numpy to compute the inverse and the determinant (I couldn't find a way to do them in PyTorch), and it seems to me that invG and detG are treated as constants during the backward pass.

The chain rule, when differentiating w.r.t. q, should go all the way back through G_metric(q).

Thanks

Calling Variable on G.data resets the loss tree, so the gradient cannot flow back through it. I think PyTorch might throw an error if you tried to backward that loss.
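A minimal way to see this (a sketch, using the same numpy round trip as in the Loss above):

import numpy as np
import torch
from torch.autograd import Variable

q = Variable(torch.ones(2), requires_grad=True)
G = torch.diag(q)                                    # still part of the graph
back = torch.Tensor(np.linalg.inv(G.data.numpy()))   # numpy round trip
print(G.grad_fn is not None)   # True: gradients can flow into G
print(back.grad_fn is None)    # True: the round trip detached it from q

Anything built from back (like invG in the Loss above) will not propagate gradients to q.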

Maybe you can try the Cholesky decomposition, as mentioned in this post:

Thanks for the reply; how do I deal with the inverse of the metric?

I'm not sure. According to this it's not available:

Yet this PR looks like it might do it.

Did you try torch.inverse?

Thanks. So to differentiate through invG, I should use invG = torch.inverse(G), and then calling Loss.backward() will differentiate through it? Is this the correct way of doing it?

Yes (assuming that it is implemented for Variable, which it looks like it should be; otherwise it will throw an error).
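For concreteness, here is a sketch of how the quadratic term of the original loss could be kept inside autograd (quadratic_term is just an illustrative name):

import torch

def quadratic_term(G_metric, p, q):
    # Keep the inverse inside the graph so gradients flow back to q
    # through G_metric(q), and to p through the quadratic form.
    G = G_metric(q)
    invG = torch.inverse(G)
    return 0.5 * p @ invG @ p

The determinant term is discussed further down in the thread.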

Hey,

I did a toy experiment:


x = torch.Tensor([3])
x = Variable(x,requires_grad=True)
G = torch.eye(2)*3 # compute matrix
invG = torch.inverse(G) # compute its inverse
L = torch.sum(invG) # get the loss
L.backward() # do gradient

and got the error message:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-26-f2b450ea7f13> in <module>()
----> 1 L.backward()

~/miniconda3/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
     91                 products. Defaults to ``False``.
     92         """
---> 93         torch.autograd.backward(self, gradient, retain_graph, create_graph)
     94 
     95     def register_hook(self, hook):

~/miniconda3/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     87     Variable._execution_engine.run_backward(
     88         tensors, grad_tensors, retain_graph, create_graph,
---> 89         allow_unreachable=True)  # allow_unreachable flag
     90 
     91 

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Is there something wrong with my experiment? Or is it simply not implemented, as you pointed out earlier?

You need to define an optimizer and pass it some variables to optimize.

BTW - I checked the pull request diff and it looks like the backward for inverse was implemented. You can verify by just writing:

x = Variable(torch.eye(2), requires_grad=True)  # any square matrix works here
torch.inverse(x)

If it didn't work, inverse would throw an error on a Variable.

I'm not sure if I misunderstood the post "How to calculate the determinant of a variable?"

But it seems to me it throws an error when the matrix is not positive definite (which is okay in my case, as I require the metric tensor to be positive definite). However, when I use

a = Variable(torch.Tensor([[2,0],[0,4]]))
torch.potrf(a).diag().prod()

The result is tensor(2.8284)

But I expected it to be 8, since its determinant is 8.

Here G does not have requires_grad=True. I think that's the reason for the error.

Probably, you wanted to do this:

x = torch.Tensor([3])
x = Variable(x,requires_grad=True)
G = torch.eye(2) * x # compute matrix
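With that change the toy experiment runs end to end (a sketch; here L = sum(inverse(x*I)) = 2/x, so the expected gradient is -2/x^2, about -0.2222 at x = 3):

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([3]), requires_grad=True)
G = torch.eye(2) * x       # G now depends on x
invG = torch.inverse(G)    # differentiable inverse
L = torch.sum(invG)        # L = 2/x
L.backward()
print(x.grad)              # -2/x**2 = -0.2222...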

It looks like it needs to be squared.

This function call calculates the Cholesky decomposition, and in your case it essentially returns a matrix whose determinant is the square root of the determinant of the original matrix (because it is a diagonal matrix). I think you wanted to do this:

a = Variable(torch.Tensor([[2,0],[0,4]]))
torch.diag(a).prod()

In the toy example that will work, but it won't generalize to compute the determinant of an arbitrary positive definite matrix, right? Only using the Cholesky decomposition as written before (adding the square) would generalize…

It will generalize, the reason being that the matrices returned by the Cholesky decomposition are triangular. Their determinant is the square root of the determinant of the original matrix, since determinants respect matrix multiplication.

>>> x = torch.tensor([[1,2,3],[0,4,5],[0,0,6]]).float()   # upper triangular

>>> x.det()
tensor(24.)

>>> x.transpose(0,1).det()
tensor(24.)

>>> torch.det(torch.mm(x, x.transpose(0,1)))   # det(A @ A^T) = det(A)**2
tensor(576.)

I am sorry for the previous answer. I got carried away I guess.
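For completeness, a quick check of the squared-Cholesky idea on a non-diagonal positive definite matrix (a sketch; in later PyTorch versions torch.potrf was renamed torch.cholesky):

import torch

a = torch.Tensor([[4., 2.],
                  [2., 3.]])      # positive definite, det = 4*3 - 2*2 = 8
chol = torch.potrf(a)             # Cholesky factor (upper triangular)
det_a = chol.diag().prod() ** 2   # square the product of the diagonal
print(det_a)                      # ~ tensor(8.)

If a is built from parameters with requires_grad=True, backward should flow through the decomposition as well.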
