import math
import numpy as np
import torch

def Loss(U, G_metric, p, q):
    ''' U is a function that takes a vector and returns a scalar.
    G_metric is a function that returns a matrix (the metric tensor).
    p, q are two vectors. '''
    D = p.numel()  # dimension of p and q
    G = G_metric(q)  # the metric matrix at q
    detG = torch.Tensor([np.linalg.det(G.data.numpy())])  # its determinant
    invG = torch.Tensor(np.linalg.inv(G.data.numpy()))  # its inverse
    return U(q) + 0.5*torch.log((2*math.pi)**D*detG) + 0.5*p@invG@p  # the loss
Now I want the gradient w.r.t. p and q.
Can I just directly do
L = Loss(U, G_metric, p, q)
to get the gradient vector? Since I use numpy to compute the inverse and determinant (as I couldn't find a way to do them in PyTorch), it seems to me that invG and detG are treated as constants during the backward pass.
The chain rule, when differentiating w.r.t. q, should go all the way back to the function G_metric(q).
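For reference, newer PyTorch versions ship differentiable built-ins for both operations (`torch.linalg.cholesky`, `torch.linalg.inv`), so the whole loss can stay inside the autograd graph. A sketch under that assumption, using the Cholesky factor for the log-determinant:

```python
import math
import torch

def loss_torch(U, G_metric, p, q):
    """Same loss, but with the determinant and inverse computed by
    differentiable torch ops, so backward() reaches G_metric(q).
    Assumes G_metric(q) returns a positive-definite matrix."""
    D = p.numel()
    G = G_metric(q)
    chol = torch.linalg.cholesky(G)                 # G = chol @ chol.T
    logdetG = 2.0 * torch.log(torch.diag(chol)).sum()
    invG = torch.linalg.inv(G)
    return U(q) + 0.5 * (D * math.log(2 * math.pi) + logdetG) + 0.5 * p @ invG @ p
```

Note that `0.5*log((2*pi)**D * detG)` is rewritten as `0.5*(D*log(2*pi) + logdetG)`, which is the same quantity but numerically safer for larger D.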
x = torch.Tensor()
x = Variable(x, requires_grad=True)
G = torch.eye(2)*3       # build a matrix
invG = torch.inverse(G)  # compute its inverse
L = torch.sum(invG)      # scalar loss
L.backward()             # backpropagate
I get the error message:
RuntimeError Traceback (most recent call last)
<ipython-input-26-f2b450ea7f13> in <module>()
----> 1 L.backward()
~/miniconda3/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
91 products. Defaults to ``False``.
---> 93 torch.autograd.backward(self, gradient, retain_graph, create_graph)
95 def register_hook(self, hook):
~/miniconda3/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
88 tensors, grad_tensors, retain_graph, create_graph,
---> 89 allow_unreachable=True) # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Is there something wrong with my experiment? Or is it simply that this is not implemented, as you pointed out earlier?
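For what it's worth, the RuntimeError in the experiment above is not about `torch.inverse` itself: nothing feeding into `L` requires grad (`x` is never used, and `G` is a plain tensor), so there is no graph to backpropagate through. A minimal sketch of the same experiment with `G` as a leaf that requires grad (modern PyTorch, no `Variable` wrapper needed):

```python
import torch

# Make G a leaf tensor that requires grad; torch.inverse is then
# differentiated through automatically.
G = torch.eye(2) * 3
G.requires_grad_(True)
invG = torch.inverse(G)  # inverse is part of the autograd graph
L = torch.sum(invG)      # scalar loss
L.backward()             # now succeeds
print(G.grad)            # gradient of sum(G^-1) w.r.t. G
```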
This function call calculates the Cholesky decomposition; in your case it essentially returns a matrix whose determinant equals the square root of the determinant of the original matrix (because the matrix is diagonal). I think you wanted to do this:
a = Variable(torch.Tensor([[2,0],[0,4]]))
In the toy example that will work, but it won't generalize to compute the determinant of any positive definite matrix, right? Only using the Cholesky decomposition as written before (adding the square) would generalize…
It will generalize. The reason is that the matrices returned by the Cholesky decomposition are triangular, so their determinant is the square root of the determinant of the original matrix, since determinants respect matrix multiplication: if G = L Lᵀ, then det(G) = det(L)·det(Lᵀ) = det(L)².
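To make that concrete, here is a small check on a non-diagonal positive-definite matrix (using the current `torch.linalg.cholesky` name):

```python
import torch

# det of a positive-definite matrix via its Cholesky factor:
# G = L @ L.T with L lower-triangular, so det(G) = det(L)**2,
# and det(L) is just the product of its diagonal entries.
G = torch.tensor([[4.0, 2.0], [2.0, 3.0]])   # PD but not diagonal
L = torch.linalg.cholesky(G)
detG = torch.prod(torch.diag(L)) ** 2
print(detG)  # matches torch.det(G) = 4*3 - 2*2 = 8
```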