Loss requires_grad is False

Hello, I have the following code (a simplified version of my code, but enough to reproduce the error):

import numpy as np
from numpy import linalg as LA
import torch
import torch.optim as optim 
import torch.nn as nn

def func(x, pars):
    a = pars[0]
    b = pars[1]
    c = pars[2]
    d = pars[3]

    x = x.int()

    H = torch.tensor([[a, b, 1], [2, 3, c], [4, d, 7]])

    eigenvalues, eigenvectors = np.linalg.eigh(H)

    trans_freq = eigenvalues[x]

    return torch.tensor(trans_freq)

x_index = torch.tensor([1,2])
y_vals = torch.tensor([0.5,12])

params = torch.tensor([1.,2.,3.,4.])
params.requires_grad=True
opt = optim.SGD([params], lr=100)

mse_loss = nn.MSELoss()

for i in range(10):
  opt.zero_grad()
  loss = mse_loss(func(x_index,params),y_vals)
  print(x_index.requires_grad)
  print(params.requires_grad)
  print(y_vals.requires_grad)
  print(loss.requires_grad)
  loss.backward()
  opt.step() 
  print(loss)

The output is:

False
True
False
False

and I am getting this error: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn, raised by loss.backward(). Indeed, the loss doesn't have requires_grad=True, but why is that the case? (Setting it manually inside the for loop doesn't work either.) What should I do? Thank you!

You are detaching trans_freq from the computation graph by wrapping it in a new tensor via return torch.tensor(trans_freq).

Return trans_freq directly and remove the unnecessary tensor creation, which also raises a warning:

UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
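
For illustration (a toy tensor, not from the original code), the detach is easy to see: re-wrapping a tracked tensor with torch.tensor produces a new leaf with no grad_fn:

t = torch.ones(3, requires_grad=True)
print((t * 2).grad_fn)              # <MulBackward0 object at ...>
print(torch.tensor(t * 2).grad_fn)  # None: the copy is a new leaf, cut off from t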

If I do that, I indeed no longer get that error, but now I get a new one: TypeError: 'int' object is not callable, coming from this line: loss = mse_loss(func(x_index,params),y_vals). I also noticed that if I try to call func(x_index,params)-y_vals, I get this error: TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'Tensor'. So it seems that, unless I explicitly write return torch.tensor(trans_freq), the output of func(x_index,params) is a NumPy array, not a PyTorch tensor.

You are right, and I missed your usage of numpy. This approach won't work directly, as PyTorch won't track the np.linalg.eigh operation and thus won't create a computation graph. Either use pure PyTorch methods instead of numpy, or create a custom autograd.Function using numpy ops, including the backward, as described here.
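
As a rough sketch of that second option (the class name NumpyEigvalsh is mine, not from the linked post): for a symmetric matrix, the eigenvalue gradient has the closed form dλ_i/dA = v_i v_iᵀ, so a numpy-backed Function could look like this:

class NumpyEigvalsh(torch.autograd.Function):
    # eigenvalues of a symmetric CPU matrix via np.linalg.eigh,
    # with the analytic gradient supplied manually in backward
    @staticmethod
    def forward(ctx, A):
        w, v = np.linalg.eigh(A.detach().numpy())
        v = torch.from_numpy(v).to(A.dtype)
        ctx.save_for_backward(v)
        return torch.from_numpy(w).to(A.dtype)

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # d(lambda_i)/dA = v_i v_i^T, chained against the incoming gradient
        return torch.einsum('i,ji,ki->jk', grad_out, v, v)

# usage: eigenvalues = NumpyEigvalsh.apply(H)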

Thanks for this. Why can't that graph be computed? Matrix diagonalization requires only additions and multiplications, so taking the derivatives should be straightforward; i.e., the eigenvalues can be written analytically in terms of the original matrix elements, and in principle the derivatives can be taken by hand. (I am genuinely curious about what makes the problem difficult computationally.) Also, is there a way to diagonalize matrices in PyTorch?

PyTorch does not track 3rd-party library operations, so either use PyTorch operations (e.g. torch.linalg.eigh for symmetric/Hermitian matrices, or torch.linalg.eig in general) or create the custom autograd.Function if you need to use numpy (or any other 3rd-party library).
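
For example (a toy check, assuming a recent PyTorch with the torch.linalg module), torch.linalg.eigh is differentiable out of the box:

A = torch.randn(3, 3, dtype=torch.float64, requires_grad=True)
w = torch.linalg.eigh(A + A.T).eigenvalues  # symmetrize so eigh applies
w.sum().backward()
print(A.grad)  # populated: the decomposition was tracked by autograd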

So I changed my code to this:

import torch
import torch.optim as optim 
import torch.nn as nn

def func(x, pars):
    a = pars[0]
    b = pars[1]
    c = pars[2]
    d = pars[3]

    x = x.int()

    H = torch.tensor([[a, b, 1], [2, 3, c], [4, d, 7]])

    eigenvalues, eigenvectors = torch.linalg.eig(H)

    trans_freq = torch.real(eigenvalues[x])

    return trans_freq

x_index = torch.tensor([1.,2.])
y_vals = torch.tensor([10.,1.1])

params = torch.tensor([1.,2.,3.,4.])
params.requires_grad=True
opt = optim.SGD([params], lr=100)

mse_loss = nn.MSELoss()

for i in range(10):
  opt.zero_grad()
  loss = mse_loss(func(x_index,params),y_vals)
  loss.backward()
  opt.step() 
  print(loss)

Now everything is PyTorch, but I am still getting the error RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn from loss.backward().

You are detaching pars by creating a new tensor (H = torch.tensor(...)), as explained in a previous post.

Use torch.cat or torch.stack if you want to create H from pars.
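
For instance, here is a minimal sketch of func built with torch.stack (same matrix layout as above; new_tensor just creates the constant entries), so H stays connected to pars:

def func(x, pars):
    a, b, c, d = pars
    x = x.int()
    # stacking keeps the graph intact, unlike torch.tensor(...)
    H = torch.stack([
        torch.stack([a, b, a.new_tensor(1.)]),
        torch.stack([a.new_tensor(2.), a.new_tensor(3.), c]),
        torch.stack([a.new_tensor(4.), d, a.new_tensor(7.)]),
    ])
    eigenvalues, eigenvectors = torch.linalg.eig(H)
    return torch.real(eigenvalues)[x]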