Compute gradient of determinant function or inverse function

ElleryL · August 26, 2018, 2:46am

I have a loss function defined like this

import math
import numpy as np
import torch

def Loss(U, G_metric, p, q):
    ''' U is a function that takes a vector and returns a scalar
        G_metric is a function that returns a matrix; it's a metric tensor
        p, q are two vectors
    '''
    D = p.size()[0]  # get the dimension of p, q
    G = G_metric(q)  # get a matrix

    detG = torch.Tensor([np.linalg.det(G.data.numpy())])  # get its determinant (via numpy)
    invG = torch.Tensor(np.linalg.inv(G.data.numpy()))  # get its inverse (via numpy)

    return U(q) + 0.5*torch.log((2*math.pi)**D*detG) + 0.5*p@invG@p  # compute the loss

Now I want the gradient w.r.t. p and q.

Can I just directly do


L = Loss(U,G_metric,p,q)
L.backward()
q.grad
p.grad

to get the gradient vectors? I use numpy to compute the inverse and determinant (as I couldn’t find a way to do them in PyTorch), so it seems to me that invG and detG are treated as constants during the backward pass.

The chain rule, when differentiating w.r.t. q, should go all the way back through the function G_metric(q).

Thanks

Ranahanocka · August 26, 2018, 3:08am

Building a new tensor from G.data (via numpy) cuts it off from the autograd graph, so invG and detG carry no history back to G_metric(q). I think PyTorch might even throw an error if you tried that loss.
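To make the point concrete, here is a minimal sketch of what the numpy round trip does to the graph. The metric `G_metric` below is a made-up diagonal example (it is not from the original post), and the calls assume a recent PyTorch release, so treat it as an illustration under those assumptions only.

```python
import numpy as np
import torch

def G_metric(q):
    # Hypothetical metric tensor, for illustration only: diag(1 + q_i^2)
    return torch.diag(1.0 + q ** 2)

q = torch.tensor([0.5, -1.0], requires_grad=True)
G = G_metric(q)

# Round trip through numpy: the result is a fresh tensor with no history
det_np = torch.tensor(float(np.linalg.det(G.detach().numpy())))

# Staying inside torch keeps the graph back to q
det_torch = torch.det(G)

print(det_np.requires_grad, det_np.grad_fn)
print(det_torch.requires_grad, det_torch.grad_fn is not None)
```

Both values agree numerically, but only `det_torch` can pass a gradient back to `q`.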

Maybe you can try the cholesky decomposition as mentioned in this post:

ElleryL · August 26, 2018, 3:10am

Thanks for the reply. How do I deal with the inverse of the metric?

Ranahanocka · August 26, 2018, 3:15am

I’m not sure. According to this it’s not available:

Yet this PR looks like it might do it.

Did you try torch.inverse?

ElleryL · August 26, 2018, 3:20am

Thanks. So to differentiate through invG, I should use invG = torch.inverse(G), and then calling L.backward() will differentiate through it? Is this the correct way of doing it?

Ranahanocka · August 26, 2018, 3:21am

Yes (assuming that it is implemented for Variable, which it looks like it should be; otherwise it will throw an error).
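For reference, a sketch of how the loss from the first post might look if everything stays inside torch. It assumes a recent PyTorch release where `torch.inverse` and `torch.logdet` are differentiable; `U` and `G_metric` here are hypothetical stand-ins, since the originals were never posted. The log term is rewritten using log((2π)^D det G) = D log(2π) + log det G.

```python
import math
import torch

def U(q):
    # Hypothetical potential, for illustration only
    return 0.5 * (q @ q)

def G_metric(q):
    # Hypothetical positive definite metric: diag(1 + q_i^2)
    return torch.diag(1.0 + q ** 2)

def loss(U, G_metric, p, q):
    D = p.size(0)
    G = G_metric(q)
    log_det_G = torch.logdet(G)   # differentiable log-determinant
    inv_G = torch.inverse(G)      # differentiable inverse
    return (U(q)
            + 0.5 * (D * math.log(2 * math.pi) + log_det_G)
            + 0.5 * (p @ inv_G @ p))

p = torch.tensor([1.0, 2.0], requires_grad=True)
q = torch.tensor([0.5, -1.0], requires_grad=True)

L = loss(U, G_metric, p, q)
L.backward()
print(p.grad)  # gradient w.r.t. p
print(q.grad)  # gradient w.r.t. q, flowing back through G_metric(q)
```

With this diagonal metric the gradients can be checked by hand: the p-gradient is inv(G) @ p, and the q-gradient picks up contributions from U, the log-determinant, and the quadratic form.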

ElleryL · August 26, 2018, 3:29am

Hey

I did a toy experiment:


x = torch.Tensor([3])
x = Variable(x,requires_grad=True)
G = torch.eye(2)*3 # compute matrix
invG = torch.inverse(G) # compute its inverse
L = torch.sum(invG) # get the loss
L.backward() # do gradient

I get this error message:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-26-f2b450ea7f13> in <module>()
----> 1 L.backward()

~/miniconda3/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
     91                 products. Defaults to ``False``.
     92         """
---> 93         torch.autograd.backward(self, gradient, retain_graph, create_graph)
     94 
     95     def register_hook(self, hook):

~/miniconda3/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     87     Variable._execution_engine.run_backward(
     88         tensors, grad_tensors, retain_graph, create_graph,
---> 89         allow_unreachable=True)  # allow_unreachable flag
     90 
     91 

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Is there something wrong with my experiment? Or is it simply because it's not implemented, as you pointed out earlier?

Ranahanocka · August 26, 2018, 3:35am

You need to define an optimizer and pass it some variables to optimize.

Ranahanocka · August 26, 2018, 3:51am

BTW - I checked the pull request diff and it looks like the backward for inverse was implemented. You can verify by just writing:

x=Variable(x,requires_grad=True)
torch.inverse(x)

If it weren't supported, torch.inverse would throw an error when given a Variable.

ElleryL · August 26, 2018, 4:32am

I’m not sure if I misunderstood the post "How to calculate the determinant of a variable?"

It seems to me that it throws an error when the matrix is not positive definite (which is okay in my case, as I require the metric tensor to be positive definite). However, when I use

a = Variable(torch.Tensor([[2,0],[0,4]]))
torch.potrf(a).diag().prod()

The result is tensor(2.8284)

But I would expect 8, since the determinant of the matrix is 8.

InnovArul · August 26, 2018, 4:42am

Here G does not have requires_grad=True. I think that's the reason for the error.

Probably, you wanted to do this:

x = torch.Tensor([3])
x = Variable(x,requires_grad=True)
G = torch.eye(2) * x # compute matrix
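Completing the toy experiment with this change, a minimal sketch (written against a recent PyTorch release, where a plain tensor with `requires_grad=True` replaces `Variable`) would be:

```python
import torch

x = torch.tensor([3.0], requires_grad=True)
G = torch.eye(2) * x       # G now depends on x
invG = torch.inverse(G)    # equals eye(2) / x
L = torch.sum(invG)        # equals 2 / x
L.backward()               # works, because L has a grad_fn
print(x.grad)              # analytically -2 / x**2
```

Since L = 2 / x here, the gradient can be checked by hand against -2 / x**2.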

Ranahanocka · August 26, 2018, 4:45am

It looks like it needs to be squared
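A sketch of the squared version, under the assumption of a recent PyTorch release: `torch.potrf` has since been removed, so `torch.linalg.cholesky` is used instead (it returns the lower-triangular factor, which does not matter for the diagonal product).

```python
import torch

a = torch.tensor([[2.0, 0.0], [0.0, 4.0]], requires_grad=True)

chol = torch.linalg.cholesky(a)             # a = chol @ chol.T
det_a = chol.diag().prod() ** 2             # det(a) = det(chol)^2
log_det_a = 2.0 * chol.diag().log().sum()   # log det(a), often more stable

det_a.backward()                            # gradient flows back to a
print(det_a, log_det_a)
print(a.grad)
```

For a loss that only needs log det(G), the `log_det_a` form avoids forming the determinant itself.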

InnovArul · August 26, 2018, 4:48am

This function call computes the Cholesky decomposition, and in your case it essentially returns a matrix whose determinant equals the square root of the determinant of the original matrix (because it is a diagonal matrix). I think you wanted to do this:

a = Variable(torch.Tensor([[2,0],[0,4]]))
torch.diag(a).prod()

Ranahanocka · August 26, 2018, 4:58am

In the toy example that will work, but it won’t generalize to computing the determinant of any positive definite matrix, right? Only using the Cholesky decomposition as written before (adding the square) would generalize…
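A quick way to check this is a positive definite matrix that is not diagonal. The matrix below is an arbitrary example chosen for illustration, and `torch.linalg.cholesky` is assumed as the modern replacement for `torch.potrf`:

```python
import torch

# Symmetric positive definite, but not diagonal
A = torch.tensor([[4.0, 2.0], [2.0, 3.0]])

det_reference = torch.det(A)
det_cholesky = torch.linalg.cholesky(A).diag().prod() ** 2
det_diag_only = torch.diag(A).prod()

print(det_reference, det_cholesky, det_diag_only)
```

Here the determinant is 4*3 - 2*2 = 8, which the squared Cholesky product should reproduce, while the plain diagonal product gives 4*3 = 12.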

InnovArul · August 26, 2018, 5:03am

The Cholesky approach (with the square) will generalize. The reason is that the matrices returned by the Cholesky decomposition are triangular, and their determinant is the square root of the determinant of the original matrix, since determinants respect matrix multiplication.

>>> x = torch.tensor([[1,2,3],[0,4,5],[0,0,6]]).float()

>>> x.det()
tensor(24.)

>>> x.transpose(0,1).det()
tensor(24.)

>>> torch.det(torch.mul(x, x.transpose(0,1)))  # elementwise product; the matrix product torch.mm(x, x.t()) has the same determinant here
tensor(576.)

I am sorry for the previous answer. I got carried away I guess.