Getting RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

I am doing binary image classification using BCEWithLogitsLoss.
Initially, I was getting RuntimeError: result type Float can’t be cast to the desired output type Long
So after searching, I converted pred and target to float, but now I am getting RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
I really have no idea what I am doing wrong.

My training loop looks like this:

for batch_idx, (data, target) in enumerate(loader['train']):
    # move to GPU
    if torch.cuda.is_available():
        data, target = data.to('cuda', non_blocking=True), target.to('cuda', non_blocking=True)  # noqa
    output = model(data)
    pred = torch.argmax(output, dim=1)
    loss = criterion(pred.float(), target.float())  # Conversion of pred and target to float
    # running average of the training loss
    train_loss += (1 / (batch_idx + 1)) * (loss.item() - train_loss)

My transfer-learning model and loss function setup looks like this:

criterion_transfer = nn.BCEWithLogitsLoss()
model = timm.create_model('convnext_tiny_in22k', pretrained=True,num_classes=2) # noqa
optimizer = torch.optim.SGD(model_transfer.parameters(),

Hi @bing, you can’t differentiate through torch.argmax: it returns integer indices, so its result has no grad_fn. You need another way to turn your output tensor into a prediction, using an operation that does have a grad_fn. A minimal example showing that the result of torch.argmax has no grad_fn:

import torch
x = torch.randn(1, 2, requires_grad=True)  # example logits: 1 sample, 2 classes
output = torch.argmax(x, dim=1)
print(output.grad_fn)  # prints None
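
To connect this to the error in the question, here is a minimal sketch that reproduces it on the CPU. The random logits tensor is only an assumed stand-in for the model output; the rest mirrors the loss call from the training loop.

```python
import torch
import torch.nn as nn

# Assumed stand-in for the model output: logits for 1 sample and 2 classes.
logits = torch.randn(1, 2, requires_grad=True)
target = torch.tensor([0])

criterion = nn.BCEWithLogitsLoss()

# argmax returns integer indices, which are detached from the autograd graph.
pred = torch.argmax(logits, dim=1)
loss = criterion(pred.float(), target.float())
loss_requires_grad = loss.requires_grad

# Calling backward on a loss with no graph behind it raises the RuntimeError.
try:
    loss.backward()
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print(loss_requires_grad)
print(error_message)
```

The loss value itself is computed without complaint; the failure only shows up once backward() is called.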

You might just be able to remove the torch.argmax call, as your loss seems to expect the raw logits, and compute the loss as:

      loss = criterion(output.float(), target.float())  # Conversion of output and target to float

There is more info in this post (it covers logits with a different loss function, which you might find useful).


I did try computing the loss as you suggested earlier, but I was getting
ValueError: Target size (torch.Size([1])) must be the same as input size (torch.Size([1, 2]))
So I switched to taking the argmax.
Below are the shapes of my output and target:

Shape of output: torch.Size([1, 2])
Shape of target: torch.Size([1])

So, I had a quick read through the docs for BCEWithLogitsLoss (docs here). The input and the target must have the same shape. Here the output is [batch, num_classes] = [1, 2], while the target is only [batch] = [1].

  1. So check your target Tensor has the right shape
  2. Or perhaps you need to reduce output to match the shape of target (as that’s what torch.argmax was effectively doing)
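
As a rough illustration of the first option, the sketch below reproduces the shape mismatch and then gives the target the same shape as the output by one-hot encoding it. The one-hot encoding is an assumption on my part (it treats the two outputs as independent binary labels), and the random logits tensor is again only a stand-in for the model output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed stand-in for the model output, shape [batch, num_classes] = [1, 2].
logits = torch.randn(1, 2, requires_grad=True)
# Class indices, shape [batch] = [1].
target = torch.tensor([0])

criterion = nn.BCEWithLogitsLoss()

# Mismatched shapes raise the ValueError from the post above.
try:
    criterion(logits, target.float())
    shape_error = None
except ValueError as err:
    shape_error = str(err)

# Give the target the same shape as the logits by one-hot encoding it.
target_one_hot = F.one_hot(target, num_classes=2).float()  # shape [1, 2]
loss = criterion(logits, target_one_hot)
loss.backward()  # works: the loss is connected to the logits

print(shape_error)
print(target_one_hot.shape, logits.grad.shape)
```

Whether a one-hot target suits the task is a modelling choice that the shapes alone don't settle.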

I am initially trying to run the network on 1 sample only, so the target is supposed to have shape [1]. I also tried unsqueezing, but it didn’t work.
Yes, the output is supposed to be a single value, but I don’t know how to get that without argmax.
Below are my target and output values:

Target -  tensor([0], device='cuda:0')
Output -  tensor([[0.3863, 0.1197]], device='cuda:0', grad_fn=<GatherBackward>)

If your target is a scalar for a single sample, it should have a shape of [1,1] because the shape is defined as [num_samples, size_of_one_sample] which corresponds to [1,1].
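
A small sketch of that reshaping, using a hypothetical batch of four labels so the [num_samples, size_of_one_sample] layout is easier to see:

```python
import torch

# Hypothetical batch of 4 integer class labels, shape [4].
target = torch.tensor([0, 1, 1, 0])

# Convert to float and add a trailing dimension: shape [4, 1].
target_2d = target.float().unsqueeze(1)

print(target.shape)
print(target_2d.shape)
```

With a single sample, the same call turns a target of shape [1] into shape [1, 1].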

If you’re trying to reduce output to its max value per sample, you could use torch.max instead of torch.argmax. The values returned by torch.max have a grad_fn, whereas the indices (which is all torch.argmax gives you) do not.

pred = torch.max(output, dim=1, keepdim=True)[0]  # need the [0] to return values, [1] is indices

Make sure to pass keepdim=True so that pred has shape [1, 1]. This approach has a grad_fn:

import torch
x = torch.randn(1, 2, requires_grad=True)  # example logits: 1 sample, 2 classes
output = torch.max(x, dim=1, keepdim=True)[0]
print(output.grad_fn)  # prints a MaxBackward0 object
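
Putting the pieces together, here is a minimal sketch of one training step on the CPU. The nn.Linear model, the input size of 8 and the learning rate are hypothetical stand-ins for the real network and settings; only the torch.max reduction and the [1, 1] target shape come from the posts above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real network: 8 input features, 2 output logits.
model = nn.Linear(8, 2)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

data = torch.randn(1, 8)    # one sample
target = torch.tensor([0])  # shape [1]

optimizer.zero_grad()
output = model(data)                                 # shape [1, 2]
pred = torch.max(output, dim=1, keepdim=True)[0]     # max values, shape [1, 1]
loss = criterion(pred, target.float().unsqueeze(1))  # both inputs are [1, 1]
loss.backward()                                      # gradients reach the model
optimizer.step()

print(pred.shape, loss.requires_grad, model.weight.grad is not None)
```

This only shows that the shapes line up and that gradients flow. A model built with a single output logit (an assumption, not something tried in this thread) would produce a [batch, 1] output directly, without the torch.max reduction.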

Thanks, It worked now.