Hello, I have the following code (a simplified version of my code, but enough to reproduce the error):

import numpy as np
from numpy import linalg as LA
import torch
import torch.optim as optim
import torch.nn as nn

def func(x, pars):
    a = pars[0]
    b = pars[1]
    c = pars[2]
    d = pars[3]
    x = x.int()
    H = torch.tensor([[a, b, 1], [2, 3, c], [4, d, 7]])
    eigenvalues, eigenvectors = np.linalg.eigh(H)
    trans_freq = eigenvalues[x]
    return torch.tensor(trans_freq)

x_index = torch.tensor([1, 2])
y_vals = torch.tensor([0.5, 12])
params = torch.tensor([1., 2., 3., 4.])
params.requires_grad = True
opt = optim.SGD([params], lr=100)
mse_loss = nn.MSELoss()

for i in range(10):
    opt.zero_grad()
    loss = mse_loss(func(x_index, params), y_vals)
    print(x_index.requires_grad)
    print(params.requires_grad)
    print(y_vals.requires_grad)
    print(loss.requires_grad)
    loss.backward()
    opt.step()
    print(loss)

The output is:

False
True
False
False

and I am getting this error: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn from this line: loss.backward(). Indeed the loss doesn't have requires_grad=True, but why is that the case? (Setting it manually in the for loop doesn't work either.) What should I do? Thank you!

You are detaching trans_freq from the computation graph by recreating a new tensor:

return torch.tensor(trans_freq)

Return trans_freq directly and remove the unnecessary tensor creation, which also raises a warning:

UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).

If I do that I am indeed not getting that error anymore, but now I get this new one: TypeError: 'int' object is not callable, coming from here: loss = mse_loss(func(x_index,params),y_vals). Also, I noticed that if I call func(x_index,params) - y_vals I get this error: TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'Tensor'. So it seems that, unless I explicitly write return torch.tensor(trans_freq), the output of func(x_index,params) is a NumPy array, not a PyTorch tensor.
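That behavior is easy to confirm in isolation. A minimal sketch (with a hypothetical symmetric matrix, not the one from the thread): NumPy functions silently convert a (non-grad-tracking) tensor to an ndarray and return NumPy types:

```python
import numpy as np
import torch

# A plain tensor (no requires_grad) is accepted by NumPy via the array protocol,
# but the results come back as NumPy arrays, not tensors.
H = torch.tensor([[2., 1.], [1., 3.]])
w, v = np.linalg.eigh(H)
print(type(w))  # <class 'numpy.ndarray'>
```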

You are right, and I missed your usage of NumPy. This approach won't directly work, as PyTorch won't track the np.linalg.eigh operation and thus won't create a computation graph. Either use pure PyTorch methods instead of NumPy, or create a custom autograd.Function using NumPy ops, including the backward, as described here.
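For illustration, here is a minimal sketch of such a custom autograd.Function (not from the thread; it assumes a symmetric H, for which the derivative of eigenvalue w_i with respect to H is the outer product v_i v_i^T of its eigenvector):

```python
import numpy as np
import torch

class NumpyEigvalsh(torch.autograd.Function):
    """Eigenvalues of a symmetric matrix via np.linalg.eigh, with a hand-written backward."""

    @staticmethod
    def forward(ctx, H):
        # Leave PyTorch, compute in NumPy, and come back as tensors.
        w, v = np.linalg.eigh(H.detach().numpy())
        v = torch.from_numpy(v)
        ctx.save_for_backward(v)
        return torch.from_numpy(w)

    @staticmethod
    def backward(ctx, grad_w):
        (v,) = ctx.saved_tensors
        # Chain rule: dL/dH = sum_i grad_w[i] * v_i v_i^T
        return torch.einsum('i,ji,ki->jk', grad_w, v, v)

H = torch.tensor([[2., 1.], [1., 3.]], dtype=torch.float64, requires_grad=True)
w = NumpyEigvalsh.apply(H)
w.sum().backward()
# The sum of eigenvalues is the trace, so H.grad is the identity matrix here.
```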

Thanks for this. Why can't that graph be computed? Matrix diagonalization requires additions and multiplications, so taking the derivatives should be straightforward, i.e., the eigenvalues can be written analytically in terms of the original matrix elements and in principle the derivatives can be taken by hand (I am genuinely curious about what makes the problem difficult computationally). Also, is there a way to diagonalize matrices in PyTorch?

PyTorch does not track third-party library operations, so either use PyTorch operations or create the custom autograd.Function if you need to use NumPy (or any other third-party library).

import torch
import torch.optim as optim
import torch.nn as nn

def func(x, pars):
    a = pars[0]
    b = pars[1]
    c = pars[2]
    d = pars[3]
    x = x.int()
    H = torch.tensor([[a, b, 1], [2, 3, c], [4, d, 7]])
    eigenvalues, eigenvectors = torch.linalg.eig(H)
    trans_freq = torch.real(eigenvalues[x])
    return trans_freq

x_index = torch.tensor([1., 2.])
y_vals = torch.tensor([10., 1.1])
params = torch.tensor([1., 2., 3., 4.])
params.requires_grad = True
opt = optim.SGD([params], lr=100)
mse_loss = nn.MSELoss()

for i in range(10):
    opt.zero_grad()
    loss = mse_loss(func(x_index, params), y_vals)
    loss.backward()
    opt.step()
    print(loss)

Now everything is PyTorch, but I am still getting the error RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn for loss.backward().
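The same detaching issue flagged earlier seems to apply here as well: H = torch.tensor([[a, b, 1], ...]) creates a brand-new leaf tensor, so H is cut off from params even though every operation is now PyTorch. A sketch of assembling H with torch.stack instead, so the graph stays connected (a suggested fix, not from the thread):

```python
import torch
import torch.nn as nn

def func(x, pars):
    a, b, c, d = pars
    one = torch.ones(())
    # torch.stack builds H out of the parameter tensors themselves,
    # keeping a, b, c, d in the computation graph.
    H = torch.stack([
        torch.stack([a, b, one]),
        torch.stack([2 * one, 3 * one, c]),
        torch.stack([4 * one, d, 7 * one]),
    ])
    eigenvalues, _ = torch.linalg.eig(H)
    return torch.real(eigenvalues[x.long()])

x_index = torch.tensor([1, 2])
y_vals = torch.tensor([10., 1.1])
params = torch.tensor([1., 2., 3., 4.], requires_grad=True)

loss = nn.MSELoss()(func(x_index, params), y_vals)
loss.backward()  # succeeds: loss now has a grad_fn
```

Note that torch.linalg.eig only supports backward when the eigenvalues are distinct, which holds for these parameter values.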