RuntimeError: element 0 of variables does not require grad and does not have a grad_fn

Hi, I have a problem here. I have a sequence of Variables which are the outputs of a bi-directional RNN, and I stacked them into a matrix xs_h whose dimension is (seq_length, batch_size, hidden_size). Then I want to update the matrix xs_h by convolving over two slices of xs_h. Some of the code is as follows:

# clone so the original bi-RNN output stays intact
new_xs_h = xs_h.clone()
# pick the two time-step slices for this batch element
vp, vc = xs_h[idx_0, bidx], xs_h[idx_1, bidx]
# project both slices, stack them, and add a leading dim for the conv layer
x = tc.stack([self.f1(vp), self.f2(vc)], dim=1)[None, :, :]
# write the convolved, squashed result back into the cloned matrix
new_xs_h[idx_1, bidx] = self.tanh(self.l_f2(self.conv(x).squeeze()))

Actually, I want to update the Variable xs_h and then let the updated matrix new_xs_h enter my computation graph again. However, I get the following error when I call backward() after running the above code:

RuntimeError: element 0 of variables does not require grad and does not have a grad_fn

I do not know why; any reply will be appreciated.
Thanks.
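
For reference, here is a minimal sketch (hypothetical names and shapes, not my actual model) that reproduces the error when nothing upstream of the stacked matrix requires grad:

import torch
from torch.autograd import Variable

# requires_grad defaults to False, so no part of this graph tracks gradients
xs_h = Variable(torch.randn(5, 2, 8))
new_xs_h = xs_h.clone()
new_xs_h[0, 0] = torch.tanh(new_xs_h[0, 0])

new_xs_h.sum().backward()
# RuntimeError: element 0 of variables does not require grad and does not have a grad_fn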


It sounds like the problem is that your xs_h doesn't have requires_grad=True. Have you tried creating Variables with requires_grad=True?


Thanks for the reply. The Variable xs_h is not created by me; it is the output of the Bi-RNN fed with the word embeddings, so its requires_grad attribute is False.


Okay. You can make a new Variable with requires_grad = True:

var_xs_h = Variable(xs_h.data, requires_grad=True)
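
A caveat on this sketch: wrapping xs_h.data creates a new leaf that is cut off from the RNN's graph, so gradients reach var_xs_h but not the layers that produced xs_h. A minimal illustration (the shape is made up):

import torch
from torch.autograd import Variable

rnn_out = Variable(torch.randn(5, 2, 8))               # stands in for xs_h
var_xs_h = Variable(rnn_out.data, requires_grad=True)  # new leaf, detached copy

var_xs_h.sum().backward()        # works: var_xs_h is a leaf
print(var_xs_h.grad.size())      # (5, 2, 8)
# but nothing upstream of rnn_out receives any gradient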

Did the suggestion solve your problem? I have the same error thrown at me, but the error isn't very helpful since I don't know which part of my code has requires_grad set to False. I went ahead and set everything I could find to trainable, but it still didn't fix it…

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

For me, loss = Variable(loss, requires_grad=True) worked.

I was trying to use a float as the loss, and it was giving me the same error.


Got the same error in a very different problem, and gt_tugsuu's post helped. Thanks!


Thank you for your answer.

Hi richard, I was also getting the same error. Using this, it got fixed, but now the loss is not decreasing with the epochs; it is constant. I think this is related to this Variable.


Could you give a link to gt_tugsuu's post? I am hitting the problem of getting the same output and loss. Thanks!

It seems my problem is not caused by the same error, so I do not need the post. Thanks!

@sonal_garg and other people who may be tempted to use loss.requires_grad=True or some similar fix: note that if you are even able to set the requires_grad of your loss, something is very wrong.

You can only set requires_grad of leaf nodes of the computation graph, that is, tensors that do not further propagate the gradient (see this explanation). If your loss does not propagate the gradient to the rest of your model, you are not training the model, and the loss is useless (even if you force it to require a gradient).

Chances are, some layer is set to requires_grad=False somewhere in the code (probably all of them, or the last layer), or the computation graph is detached. Possibly the backward call is wrapped in a with torch.no_grad() block.
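
To make the leaf-node point concrete, a minimal sketch (names are illustrative, not from anyone's actual code): a loss that is still attached to the graph refuses the flag, while a detached one accepts it and then trains nothing.

import torch

w = torch.randn(3, requires_grad=True)
loss = (w * 2).sum()             # non-leaf: it has a grad_fn back to w
# loss.requires_grad = True     # would raise: you can only change
#                               # requires_grad flags of leaf variables

detached = loss.detach()         # a leaf with no grad_fn
detached.requires_grad = True    # allowed, which is exactly the warning sign
detached.backward()              # runs without error...
print(w.grad)                    # ...but prints None: the model never trains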


I have the same error with different code. @richard @yGalindo, how does one generally locate the source of the error? I've already checked that my loss has requires_grad=True and that the model's input to the loss function has requires_grad=True.


I found my problem. For anyone else after me, I had a loop making multiple passes through the model and adding each pass’s loss to a running total. But I initialized the total loss to 0 outside the loop and that tensor had requires_grad=False, a property that was preserved as each pass’s loss was added to the total loss. I solved this using total_reward = torch.zeros(1, requires_grad=True).
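
For illustration, a minimal sketch of that pattern with the fix applied (the model and loop body are placeholders, not my actual code):

import torch

model = torch.nn.Linear(4, 1)

# accumulator created as a leaf that requires grad, per the fix above
total_reward = torch.zeros(1, requires_grad=True)

for _ in range(3):
    out = model(torch.randn(2, 4))
    total_reward = total_reward + out.mean()   # rebind; += on a leaf that
                                               # requires grad would error

total_reward.backward()                        # gradients reach the model
print(model.weight.grad is not None)           # True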


Thank you for sharing this!
My problem was resolved right away!
loss.requires_grad = True (edited the typo)


Hi everyone, I am getting RuntimeError: element 11 of tensors does not require grad and does not have a grad_fn when running a GAN architecture to fit a Gaussian model, and I am using the WGAN loss as well.

How can I fix the error so that I can get the required output?

The error is usually thrown if you detach a tensor from the computation graph, e.g. by:

  • calling tensor.detach() on it directly
  • transforming it to a numpy array or a Python literal
  • recreating a tensor via torch.tensor(tensor)

Could you check your code for any of these ops?
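
For reference, a minimal sketch of the three ops above (names are illustrative); each yields a tensor with requires_grad=False and no grad_fn, so calling backward() through it raises this error:

import torch

x = torch.randn(3, requires_grad=True)
y = (x * 2).sum()

a = y.detach()                  # 1. explicit detach
b = torch.tensor(y.item())      # 2. round trip through a Python float
c = torch.tensor(y)             # 3. recreating via torch.tensor (warns, too)

for t in (a, b, c):
    print(t.requires_grad, t.grad_fn)   # False None, for all three
# a.backward() would raise:
# RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn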


Actually, I have not called torch.detach(), nor have I transformed any tensor to a numpy array or Python literal.

The error occurs when I try to do higher-order differentiation of tensors.
For me, HVP_Y = torch.autograd.grad(inputs=y, outputs=DerivateX_g, grad_outputs=B_X, retain_graph=True, allow_unused=True) is the place in my code where I get the error.
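
For comparison, a minimal higher-order-gradient call that does work looks like this (toy tensors, not my model); the key detail is that the first grad call needs create_graph=True, otherwise the resulting gradient has no grad_fn and the second call raises exactly this error:

import torch

x = torch.randn(3, requires_grad=True)
y = (x ** 3).sum()

# create_graph=True records a graph of the gradient itself, so it can
# be differentiated again; without it, grad_x has no grad_fn
grad_x, = torch.autograd.grad(y, x, create_graph=True)

v = torch.ones_like(x)
hvp, = torch.autograd.grad(grad_x, x, grad_outputs=v)   # Hessian-vector product
print(hvp)                                              # equals 6 * x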

Could you print the tensors y and DerivateX_g please?

I have printed both the tensors y and DerivateX_g and put them in an image for convenience.