I’m getting the error "RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn" when I run the backward pass while training a ResNet followed by a deconvolutional layer. Let me explain the situation in more detail:

I define my model using a pre-trained ResNet-50 from which I removed the last 2 blocks (the error also occurs if I use the complete model). After the ResNet blocks, instead of the fully connected layer, I use a deconvolutional (conv-transpose) layer, because my task is a regression problem.
After the deconvolutional layer I perform some numpy operations to extract the prediction of my model. The workflow is this:

result = self.resnet_encoder(x) # the resnet layers
result = self.decoder(result) # the convolutional-transpose layer
output = ...  # a series of basic numpy operations on result (not shown)

If I define my loss function (MSE) on the result tensor, everything works: the model can run the backward pass. If I instead define the loss on the output tensor, the loss value is computed correctly, but I get the error above.

The loss is defined like this:

loss_fun = nn.MSELoss()
# result = my model result prediction
# result_dataset = could be seen as a ground truth expectation of the result
decoder_loss = loss_fun(result, torch.from_numpy(result_dataset))
# output = my model output prediction
# output_dataset = ground truth output expectation
output_loss = loss_fun(torch.from_numpy(output), torch.from_numpy(output_dataset))

The error appears when I call:

output_loss.backward()
# decoder_loss.backward() => this instead works correctly
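The behaviour described above can be reproduced in a few lines. This is only a minimal sketch: the small ConvTranspose2d layer, the input shapes, and the multiplication by 2.0 are hypothetical stand-ins for the real model and the real numpy operations, which are not shown in the post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the ResNet encoder + conv-transpose decoder.
decoder = nn.ConvTranspose2d(4, 1, kernel_size=2, stride=2)
x = torch.randn(2, 4, 8, 8)
result = decoder(x)

loss_fun = nn.MSELoss()

# Loss on the graph-connected tensor: backward works.
decoder_loss = loss_fun(result, torch.zeros_like(result))
decoder_loss.backward()

# Going through numpy requires detach(), which drops the autograd history.
output = result.detach().numpy() * 2.0   # placeholder numpy operation
output_t = torch.from_numpy(output)      # requires_grad is False, grad_fn is None
output_loss = loss_fun(output_t, torch.zeros_like(output_t))

try:
    output_loss.backward()
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print(error_message)
```

The loss value itself is still computed correctly; only the link back to the model parameters is missing.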

The problem is that my labels are based on the output tensor, so I need a way to compute the loss and run the backward pass based on that value.

If you leave PyTorch and use third-party libraries such as numpy, these operations are detached from the computation graph, and PyTorch won’t be able to calculate gradients for them.
You could either implement a custom autograd.Function as described here or use PyTorch operations instead.
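As a rough illustration of the second option, here is a sketch that replaces a numpy reduction with its PyTorch equivalent. The spatial mean is an assumption chosen for the example (the actual numpy operations are not shown in the thread), and the small decoder and shapes are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real decoder.
decoder = nn.ConvTranspose2d(4, 1, kernel_size=2, stride=2)
x = torch.randn(2, 4, 8, 8)
result = decoder(x)                      # shape (2, 1, 16, 16)

# numpy version (breaks the graph):
#   output = result.detach().numpy().mean(axis=(2, 3))
# PyTorch version (stays in the graph):
output = result.mean(dim=(2, 3))         # shape (2, 1)

output_dataset = torch.zeros(2, 1)       # placeholder ground truth
output_loss = nn.MSELoss()(output, output_dataset)
output_loss.backward()                   # gradients reach the decoder weights

print(output.grad_fn is not None, decoder.weight.grad.shape)
```

Most elementwise numpy functions, reductions, and indexing operations have direct torch counterparts, so this is usually the simpler route when the operations are differentiable.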

A solution that seems to work for me now is to compute the loss using both the result from the PyTorch graph and the result of the final numpy part. To make it clearer, the loss is now calculated like this:

It seems that with this loss, passing the result of the first part of the graph as var1 and the result of the last part (with the numpy operations) as var2, PyTorch is able to compute the backward pass. It is not clear to me, though, why it works in this case and not in the previous one.

You could either implement custom autograd.Functions as described here

If I want to write my own autograd.Function in this case, given that the numpy part has no weights and therefore no gradients of its own, would the solution be to simply return the gradient of the output?
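For reference, a custom autograd.Function wrapping a numpy operation could look like the sketch below. The class name NumpySquare and the choice of np.square are hypothetical stand-ins for the real numpy code. Note that backward has to apply the derivative of the numpy operation with respect to its input, even when the operation has no weights; returning grad_output unchanged would only be correct if the operation behaved like the identity.

```python
import numpy as np
import torch


class NumpySquare(torch.autograd.Function):
    """Hypothetical example: computes x**2 with numpy in the forward pass."""

    @staticmethod
    def forward(ctx, inp):
        ctx.save_for_backward(inp)
        out = np.square(inp.detach().cpu().numpy())
        return torch.from_numpy(out).to(inp.device)

    @staticmethod
    def backward(ctx, grad_output):
        (inp,) = ctx.saved_tensors
        # Chain rule: d(x**2)/dx = 2x, scaled by the incoming gradient.
        return grad_output * 2.0 * inp


x = torch.randn(3, 4, dtype=torch.double, requires_grad=True)
y = NumpySquare.apply(x)
y.sum().backward()

# Numerically compare the hand-written backward against finite differences.
gradcheck_ok = torch.autograd.gradcheck(NumpySquare.apply, (x,))
print(gradcheck_ok)
```

torch.autograd.gradcheck is a convenient way to confirm that a hand-written backward matches the forward computation; it expects double-precision inputs.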

The loss part created by numpy would be treated as a constant value, similar to:

loss = 0.5 * valid_pytorch_loss + 1.

where the 1. could be any value provided by the numpy loss function and would thus not be used in calculating the gradients.
The error is not raised, since the PyTorch loss is still valid and you are of course free to add constant values to it.
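A small check of this point, using a hypothetical linear layer in place of the real model: adding a constant taken from the numpy part changes the loss value but leaves the gradients exactly those of the scaled PyTorch loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

layer = nn.Linear(3, 1)                  # hypothetical stand-in model
x = torch.randn(5, 3)
target = torch.randn(5, 1)
loss_fun = nn.MSELoss()

# Gradient of the PyTorch loss alone.
valid_pytorch_loss = loss_fun(layer(x), target)
valid_pytorch_loss.backward()
grad_plain = layer.weight.grad.clone()

# Gradient of the combined loss with a constant from the numpy part.
layer.zero_grad()
numpy_loss = 1.0                         # any float computed outside the graph
combined = 0.5 * loss_fun(layer(x), target) + numpy_loss
combined.backward()
grad_combined = layer.weight.grad.clone()

# The constant contributes nothing: only the 0.5 factor shows up.
print(torch.allclose(grad_combined, 0.5 * grad_plain))
```

So the combined loss runs without an error, but the numpy part has no influence on the parameter updates.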