RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn when training from examples

Hey everyone.
I’m trying to train this example:


import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        # clamp(min=0) acts as a ReLU
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred

m = TwoLayerNet(1, 10, 2)
m.to(device)

criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(m.parameters(), lr=1e-4)

# .to(device) instead of .cuda() so the snippet also runs on CPU-only machines
y = m(torch.Tensor([[1]]).to(device))
loss = criterion(y, torch.Tensor([[1, 1]]).to(device))
optimizer.zero_grad()
loss.backward()

And I get this error:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Your code snippet works fine for me after adding loss.backward(), so I guess you might be calling .backward() on another (detached) tensor?
Could you compare the posted snippet against your actual code and update it so that it reproduces this error, please?
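
For reference, here is a minimal sketch (not your code) of how calling .backward() on a detached tensor produces exactly this error:

import torch

x = torch.randn(3, requires_grad=True)
out = (x * 2).sum()
detached = out.detach()  # .detach() drops the grad_fn
detached.backward()      # RuntimeError: element 0 of tensors does not require grad ...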

I think I found the problem.

I have this line of code in my notebook.

tts_model, tts_decoder, _ = torch.hub.load(repo_or_dir='snakers4/silero-models',
                                           model='silero_stt',
                                           language='en',  # also available: 'de', 'es'
                                           device=device)

When I restart the kernel and run the example together with this line of code, the problem occurs.
To reproduce the error, put this line before the model initialization.
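
Condensed, the repro order is roughly this (a sketch reusing the definitions from my first snippet):

# 1. load silero_stt first
tts_model, tts_decoder, _ = torch.hub.load(repo_or_dir='snakers4/silero-models',
                                           model='silero_stt',
                                           language='en',
                                           device=device)

# 2. then run the forward/backward pass from the example above
y = m(torch.Tensor([[1]]).to(device))
loss = criterion(y, torch.Tensor([[1, 1]]).to(device))
optimizer.zero_grad()
loss.backward()  # raises the RuntimeError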

Does the newly posted line of code raise this error on its own, or do you see the failure in the previously posted model only when the silero_stt model is loaded first?

The error is raised when I call loss.backward(), but only if the silero_stt model was loaded first. I tried both the GPU and CPU versions.

I can reproduce the issue: creating the silero-models via this torch.hub.load call disables the gradient calculation globally.
I don’t know why it disables it, but you can enable it again via:

torch.set_grad_enabled(True)

after loading the model.
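
You can also verify the global state with torch.is_grad_enabled(), or scope the workaround with a context manager instead of flipping the global flag (a sketch reusing the snippet above):

print(torch.is_grad_enabled())  # False after loading silero_stt

# Option 1: restore the global default
torch.set_grad_enabled(True)

# Option 2: run the training step explicitly inside an enabled-grad context
with torch.enable_grad():
    y = m(torch.Tensor([[1]]).to(device))
    loss = criterion(y, torch.Tensor([[1, 1]]).to(device))
    optimizer.zero_grad()
    loss.backward()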

It worked!! Thank you.
I’ll ask them why they did that.