RuntimeError: element 0 of variables does not require grad and does not have a grad_fn

Hi, I have a problem here. I have a sequence of Variables which are the outputs of a bi-directional RNN, and I stacked them into a matrix xs_h whose dimension is (seq_length, batch_size, hidden_size). Then I want to update the matrix xs_h by convolving over two slices of xs_h. Some of the code is as follows:

# clone so the original bi-RNN output stays intact
new_xs_h = xs_h.clone()
# pick the two time-step slices for this batch element
vp, vc = xs_h[idx_0, bidx], xs_h[idx_1, bidx]
# project both slices, stack them, and add a leading dim for the conv layer
x = tc.stack([self.f1(vp), self.f2(vc)], dim=1)[None, :, :]
# write the convolved, squashed result back into the cloned matrix
new_xs_h[idx_1, bidx] = self.tanh(self.l_f2(self.conv(x).squeeze()))

Actually, I want to update the Variable xs_h and then let the updated matrix new_xs_h enter my computation graph again. However, I get the following error when I call backward() after running the above code:

RuntimeError: element 0 of variables does not require grad and does not have a grad_fn

I do not know why; any reply will be appreciated.
Thanks.
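
For reference, here is a minimal sketch (hypothetical names and shapes, not my actual model) that reproduces the error when nothing upstream of the stacked matrix requires grad:

import torch
from torch.autograd import Variable

# requires_grad defaults to False, so no part of this graph tracks gradients
xs_h = Variable(torch.randn(5, 2, 8))
new_xs_h = xs_h.clone()
new_xs_h[0, 0] = torch.tanh(new_xs_h[0, 0])

new_xs_h.sum().backward()
# RuntimeError: element 0 of variables does not require grad and does not have a grad_fn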


It sounds like the problem is that your xs_h doesn't have requires_grad=True. Have you tried creating Variables with requires_grad=True?


Thanks for the reply. The Variable xs_h is not created by me; it is the output of the Bi-RNN fed with the word embeddings, so its requires_grad attribute is False.


Okay. You can make a new Variable with requires_grad = True:

var_xs_h = Variable(xs_h.data, requires_grad=True)
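
A caveat on this sketch: wrapping xs_h.data creates a new leaf that is cut off from the RNN's graph, so gradients reach var_xs_h but not the layers that produced xs_h. A minimal illustration (the shape is made up):

import torch
from torch.autograd import Variable

rnn_out = Variable(torch.randn(5, 2, 8))               # stands in for xs_h
var_xs_h = Variable(rnn_out.data, requires_grad=True)  # new leaf, detached copy

var_xs_h.sum().backward()        # works: var_xs_h is a leaf
print(var_xs_h.grad.size())      # (5, 2, 8)
# but nothing upstream of rnn_out receives any gradient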

Did the suggestion solve your problem? I have the same error thrown at me, but the error isn't very helpful since I don't know which part of my code has requires_grad set to False. I went ahead and set everything I could find to trainable, but it still didn't fix it…

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

For me, loss = Variable(loss, requires_grad=True) worked.

I was trying to use a float as the loss, and it was giving me the same error.


Got the same error in a very different problem, and gt_tugsuu's post helped. Thanks!


Thank you for your answer.

Hi richard, I was also getting the same error. Using this, it got fixed, but now the loss is not decreasing with the epochs; it is constant. I think this is related to this Variable.


Could you give a link to gt_tugsuu's post? I am hitting the problem of getting the same output and loss. Thanks!

It seems my problem is not caused by the same error, so I do not need the post. Thanks!

@sonal_garg and other people who may be tempted to use loss.requires_grad=True or some similar fix: note that if you are even able to set the requires_grad of your loss, something is very wrong.

You can only set requires_grad of leaf nodes of the computation graph, that is, tensors that do not further propagate the gradient (see this explanation). If your loss does not propagate the gradient to the rest of your model, you are not training the model, and the loss is useless (even if you force it to require a gradient).

Chances are, some layer is set to requires_grad=False somewhere in the code (probably all of them, or the last layer), or the computation graph is detached. Possibly the backward call is wrapped in a with torch.no_grad() block.
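
To make the leaf-node point concrete, a minimal sketch (names are illustrative, not from anyone's actual code): a loss that is still attached to the graph refuses the flag, while a detached one accepts it and then trains nothing.

import torch

w = torch.randn(3, requires_grad=True)
loss = (w * 2).sum()             # non-leaf: it has a grad_fn back to w
# loss.requires_grad = True     # would raise: you can only change
#                               # requires_grad flags of leaf variables

detached = loss.detach()         # a leaf with no grad_fn
detached.requires_grad = True    # allowed, which is exactly the warning sign
detached.backward()              # runs without error...
print(w.grad)                    # ...but prints None: the model never trains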


I have the same error with different code. @richard @yGalindo, how does one generally locate the source of the error? I've already checked that my loss has requires_grad=True and that the model's input to the loss function has requires_grad=True.


I found my problem. For anyone else after me, I had a loop making multiple passes through the model and adding each pass’s loss to a running total. But I initialized the total loss to 0 outside the loop and that tensor had requires_grad=False, a property that was preserved as each pass’s loss was added to the total loss. I solved this using total_reward = torch.zeros(1, requires_grad=True).
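
For illustration, a minimal sketch of that pattern with the fix applied (the model and loop body are placeholders, not my actual code):

import torch

model = torch.nn.Linear(4, 1)

# accumulator created as a leaf that requires grad, per the fix above
total_reward = torch.zeros(1, requires_grad=True)

for _ in range(3):
    out = model(torch.randn(2, 4))
    total_reward = total_reward + out.mean()   # rebind; += on a leaf that
                                               # requires grad would error

total_reward.backward()                        # gradients reach the model
print(model.weight.grad is not None)           # True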


Thank you for sharing this!
My problem was resolved right away!
loss.requires_grad = True (edited the typo)


Hi everyone, I am getting RuntimeError: element 11 of tensors does not require grad and does not have a grad_fn when running a GAN architecture to fit a Gaussian model, and I am using the WGAN loss as well.

How can I fix the error so that I can get the required output?

The error is usually thrown if you detach a tensor from the computation graph, e.g. by:

  • calling tensor.detach() on it directly
  • transforming it to a numpy array or a Python literal
  • recreating a tensor via torch.tensor(tensor)

Could you check your code for any of these ops?
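
For reference, a minimal sketch of the three ops above (names are illustrative); each yields a tensor with requires_grad=False and no grad_fn, so calling backward() through it raises this error:

import torch

x = torch.randn(3, requires_grad=True)
y = (x * 2).sum()

a = y.detach()                  # 1. explicit detach
b = torch.tensor(y.item())      # 2. round trip through a Python float
c = torch.tensor(y)             # 3. recreating via torch.tensor (warns, too)

for t in (a, b, c):
    print(t.requires_grad, t.grad_fn)   # False None, for all three
# a.backward() would raise:
# RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn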


Actually, I have not called torch.detach(), nor have I transformed any tensor to a numpy array or Python literal.

The error occurs when I try to do higher-order differentiation of tensors.
For me, HVP_Y = torch.autograd.grad(inputs=y, outputs=DerivateX_g, grad_outputs=B_X, retain_graph=True, allow_unused=True) is the place in my code where I get the error.
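
For comparison, a minimal higher-order-gradient call that does work looks like this (toy tensors, not my model); the key detail is that the first grad call needs create_graph=True, otherwise the resulting gradient has no grad_fn and the second call raises exactly this error:

import torch

x = torch.randn(3, requires_grad=True)
y = (x ** 3).sum()

# create_graph=True records a graph of the gradient itself, so it can
# be differentiated again; without it, grad_x has no grad_fn
grad_x, = torch.autograd.grad(y, x, create_graph=True)

v = torch.ones_like(x)
hvp, = torch.autograd.grad(grad_x, x, grad_outputs=v)   # Hessian-vector product
print(hvp)                                              # equals 6 * x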

Could you print the tensors y and DerivateX_g please?

I have printed both the tensors y and DerivateX_g and put them in an image for convenience.