Setting requires_grad=True on labels causes problems

I am a beginner with PyTorch. While building a toy neural network for regression, I mistakenly set requires_grad of the label tensor y to True (even though its gradient is not needed when computing the loss), and this makes the network diverge: the loss grows larger and larger and finally becomes NaN.

Previously, when I tried to visualize a tensor with matplotlib, I could not convert it to a NumPy array while requires_grad=True. I am wondering if this happens for a similar reason.
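For reference, this is roughly the conversion issue I ran into earlier (a minimal sketch; the exact error wording may differ across PyTorch versions):

import torch
import matplotlib.pyplot as plt

t = torch.linspace(-1, 1, 100, requires_grad=True)

# t.numpy() raises "RuntimeError: Can't call numpy() on Tensor that requires grad.
# Use tensor.detach().numpy() instead."
# Detaching first works, because the detached tensor no longer tracks gradients.
arr = t.detach().numpy()

plt.plot(arr, arr ** 2)
plt.show()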

The following is the code.

import torch

# Toy regression data: y = x^2 plus uniform noise, shaped (100, 1)
x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)
y = x.pow(2) + 10 * torch.rand(x.size())

# The problematic lines: both the inputs and the labels are set to track gradients
x.requires_grad = True
y.requires_grad = True

class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden)
        self.predict = torch.nn.Linear(n_hidden, n_output)
    
    def forward(self, x):
        x = torch.relu(self.hidden(x))
        x = self.predict(x)
        return x

net = Net(1, 10, 1)
optimizer = torch.optim.SGD(net.parameters(), lr=0.5)
criterion = torch.nn.MSELoss()


for t in range(200):
    y_pred = net(x)

    loss = criterion(y_pred, y)

    optimizer.zero_grad()
    loss.backward()
    print("Epoch {}: {}".format(t, loss))
    optimizer.step()

I'm not sure if I understand your question.

If you are trying to predict y from x with a function, why do you need y to require gradients? Your code should just work after commenting out the following line:

y.requires_grad = True

Alternatively, if you have to set y to require gradients, you can change the line that calculates the loss to:

loss = criterion(y_pred, y.requires_grad_(False))

This should also work.
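Note that requires_grad_(False) flips the flag on y itself for all later iterations. If you would rather leave y untouched and only ignore the gradient when computing the loss, detaching the target works as well (just an alternative sketch of the same loop):

for t in range(200):
    y_pred = net(x)

    # detach() returns a view of y that does not track gradients,
    # so the target is treated as a constant in the loss
    loss = criterion(y_pred, y.detach())

    optimizer.zero_grad()
    loss.backward()
    print("Epoch {}: {}".format(t, loss.item()))
    optimizer.step()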
