AveragePrecision from torchmetrics as a loss function


In a binary classification task I am using AveragePrecision from torchmetrics as my loss function:

import torch
from torchmetrics import AveragePrecision
from tqdm import tqdm

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
optimizer = torch.optim.Adam(transformer.parameters(), lr=lr)
loss = AveragePrecision(task="binary", average="micro")
train_plot = []
val_plot = []

for epoch in tqdm(range(epochs), position=0, leave=True):
    train_loss = []
    val_loss = []

    for x, t in tqdm(trainloader, position=0, leave=True):
        x, t = x.to(device), t.to(device)
        y = transformer(x)
        y = y.squeeze(-1)
        J = loss(y, t.long())
        J.backward()  # raises the RuntimeError shown below
        optimizer.step()
        train_plot.append(J.item())

When I try to run the backward pass, I get a runtime error indicating that autograd isn’t working for my loss function:

RuntimeError                              Traceback (most recent call last)
Cell In[28], line 21
     19 y = y.squeeze(-1)
     20 J = loss(y, t.long())
---> 21 J.backward()
     22 optimizer.step()
     23 train_plot.append(J.item())

File c:\Users\pietr\anaconda3\envs\optiprint\Lib\site-packages\torch\_tensor.py:492, in Tensor.backward(self, gradient, retain_graph, create_graph, inputs)
    482 if has_torch_function_unary(self):
    483     return handle_torch_function(
    484         Tensor.backward,
    485         (self,),
    490         inputs=inputs,
    491     )
--> 492 torch.autograd.backward(
    493     self, gradient, retain_graph, create_graph, inputs=inputs
    494 )

File c:\Users\pietr\anaconda3\envs\optiprint\Lib\site-packages\torch\autograd\__init__.py:251, in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    246     retain_graph = create_graph
    248 # The reason we repeat the same comment below is that
    249 # some Python versions print out the first line of a multi-line function
    250 # calls in the traceback and some print out the last line
--> 251 Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    252     tensors,
    253     grad_tensors_,
    254     retain_graph,
    255     create_graph,
    256     inputs,
    257     allow_unreachable=True,
    258     accumulate_grad=True,
    259 )

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

I tried using a different loss function, namely BCELoss, and everything runs perfectly fine. Can AveragePrecision from torchmetrics even be used as a loss function, or am I doing something wrong?

Let me know what you think, any help will be greatly appreciated.

AveragePrecision is not differentiable, as indicated here, and it also explicitly uses a torch.no_grad() context here, assuming I’ve found the correct line of code.
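To illustrate why the error appears, here is a minimal sketch using only torch (no torchmetrics involved). The tensors `logits` and `targets` are made-up stand-ins for the model output and labels, and the mean of the sigmoid is just a placeholder for whatever is computed inside the no_grad() block; the point is only that any result produced under torch.no_grad() has no grad_fn, so calling backward() on it fails with the same message as in the traceback above.

```python
import torch

torch.manual_seed(0)
logits = torch.randn(8, requires_grad=True)   # hypothetical stand-in for model outputs
targets = torch.randint(0, 2, (8,)).float()   # hypothetical stand-in for binary labels

# Anything computed under torch.no_grad() is cut off from the autograd graph.
with torch.no_grad():
    detached_score = torch.sigmoid(logits).mean()

print(detached_score.requires_grad)  # no gradient tracking on the result
print(detached_score.grad_fn)        # no backward function recorded

# Calling backward() on such a tensor raises a RuntimeError.
try:
    detached_score.backward()
    backward_error = None
except RuntimeError as err:
    backward_error = str(err)
print(backward_error)

# A differentiable loss computed outside no_grad() keeps the graph intact.
bce = torch.nn.functional.binary_cross_entropy_with_logits(logits, targets)
bce.backward()
print(logits.grad is not None)
```

This is consistent with BCELoss working in the original loop: it is built from differentiable operations, so autograd can trace it back to the model parameters.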

Thanks @ptrblck, indeed it cannot be used as a loss function in this form.