import math
import numpy as np
import torch

def Loss(U, G_metric, p, q):
    '''U is a function that takes a vector and returns a scalar.
    G_metric is a function that returns a matrix; it's a metric tensor.
    p, q are two vectors.
    '''
    D = p.size()[0]  # get the dimension of p, q
    G = G_metric(q)  # get a matrix
    detG = torch.Tensor([np.linalg.det(G.data.numpy())])  # get its determinant
    invG = torch.Tensor(np.linalg.inv(G.data.numpy()))  # get its inverse
    return U(q) + 0.5*torch.log((2*math.pi)**D*detG) + 0.5*p@invG@p  # compute the loss
Now I want the gradient w.r.t. p and q.
Can I just directly do
L = Loss(U,G_metric,p,q)
L.backward()
q.grad
p.grad
to get the gradient vectors? I use NumPy to compute the inverse and determinant (as I couldn't find a way to do them in PyTorch), so it seems to me that invG and detG are treated as constants during the backward pass.
The chain rule, when differentiating w.r.t. the variable q, should go all the way back to the function G_metric(q).
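A minimal sketch of a version of this loss that stays inside autograd, assuming a recent PyTorch release where torch.inverse and torch.logdet are differentiable. The U and G_metric below are hypothetical stand-ins chosen only so the example runs; they are not from the original post.

```python
import math
import torch

def loss(U, G_metric, p, q):
    D = p.shape[0]                      # dimension of p and q
    G = G_metric(q)                     # metric tensor, depends on q
    logdetG = torch.logdet(G)           # differentiable log-determinant
    invG = torch.inverse(G)             # differentiable inverse
    # 0.5*log((2*pi)^D * det G) written as a sum of logs
    return U(q) + 0.5 * (D * math.log(2 * math.pi) + logdetG) + 0.5 * (p @ invG @ p)

# Hypothetical potential and metric, for illustration only
U = lambda q: 0.5 * (q @ q)
G_metric = lambda q: torch.eye(q.shape[0]) * (1.0 + q @ q)

p = torch.tensor([1.0, 2.0], requires_grad=True)
q = torch.tensor([0.5, -0.5], requires_grad=True)

L = loss(U, G_metric, p, q)
L.backward()
print(p.grad)  # gradient w.r.t. p
print(q.grad)  # gradient w.r.t. q, flows back through G_metric(q)
```

Because no step converts to NumPy, the graph from L back to q is unbroken, so q.grad includes the contribution from G_metric(q).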
Thanks. So to differentiate through invG, I think I should use invG = torch.inverse(G); then calling backward() on the loss will differentiate through it? Is this the correct way of doing it?
x = torch.Tensor([3])
x = Variable(x,requires_grad=True)
G = torch.eye(2)*3 # compute matrix
invG = torch.inverse(G) # compute its inverse
L = torch.sum(invG) # get the loss
L.backward() # do gradient
I get the error message:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-26-f2b450ea7f13> in <module>()
----> 1 L.backward()
~/miniconda3/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
91 products. Defaults to ``False``.
92 """
---> 93 torch.autograd.backward(self, gradient, retain_graph, create_graph)
94
95 def register_hook(self, hook):
~/miniconda3/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
87 Variable._execution_engine.run_backward(
88 tensors, grad_tensors, retain_graph, create_graph,
---> 89 allow_unreachable=True) # allow_unreachable flag
90
91
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Is there something wrong with my experiment? Or is it simply because it's not implemented, as you pointed out earlier?
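One likely cause of the error in the experiment above: G = torch.eye(2)*3 is built from constants only and never uses x, so L has no grad_fn and nothing to backpropagate to. A minimal sketch of the same experiment with G depending on x, assuming a recent PyTorch release (where torch.tensor(..., requires_grad=True) replaces Variable):

```python
import torch

x = torch.tensor([3.0], requires_grad=True)

G = torch.eye(2) * x        # G now depends on x
invG = torch.inverse(G)     # inverse stays in the autograd graph
L = torch.sum(invG)         # sum of entries of inv(x*I) is 2/x
L.backward()

print(x.grad)               # d(2/x)/dx = -2/x**2
```

Here the sum of the entries of the inverse is 2/x, so the gradient can be checked by hand against -2/x**2.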
But it seems to me that it throws an error when the matrix is not positive definite (which is okay in my case, as I require the metric tensor to be positive definite). However, when I use
a = Variable(torch.Tensor([[2,0],[0,4]]))
torch.potrf(a).diag().prod()
The result is tensor(2.8284).
But I expected it to be 8, since the determinant of the matrix is 8.
This function call computes the Cholesky decomposition, and in your case it essentially returns a matrix whose determinant equals the square root of the determinant of the original matrix (because it is a diagonal matrix). I think you wanted to do this:
a = Variable(torch.Tensor([[2,0],[0,4]]))
torch.diag(a).prod()
In the toy example that will work, but it won't generalize to computing the determinant of any positive definite matrix, right? Only using the Cholesky decomposition as written before (adding the square) would generalize...
It will generalize, because the matrices returned by the Cholesky decomposition are triangular. Their determinant is the square root of the determinant of the original matrix, since the determinant is multiplicative: det(A) = det(L) det(L^T) = det(L)^2.
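The point about the square can be sketched on a non-diagonal positive definite matrix. This assumes a recent PyTorch release, where torch.linalg.cholesky takes the place of the older torch.potrf call used above; the matrix is a made-up example with determinant 4*3 - 2*2 = 8.

```python
import torch

# Symmetric positive definite, not diagonal
a = torch.tensor([[4.0, 2.0],
                  [2.0, 3.0]], requires_grad=True)

chol = torch.linalg.cholesky(a)          # lower-triangular factor, a = chol @ chol.T
det_chol = chol.diagonal().prod() ** 2   # det(a) = (product of the factor's diagonal)^2
det_ref = torch.linalg.det(a)            # reference value for comparison

print(det_chol.item(), det_ref.item())

det_chol.backward()                      # gradients flow through the factorization
print(a.grad)
```

Without the square, the product of the factor's diagonal is only the square root of the determinant, which is what the tensor(2.8284) result reflects.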