Sklearn loss function

I am trying to use sklearn.metrics.average_precision_score as the loss function as follows:

optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for i, (region, labels) in tqdm.tqdm(enumerate(train_loader)):  
        region = region.to(device)
        labels = labels.to(device)
        # Forward pass
        outputs = model(region)
        ap = -average_precision_score(labels.float().detach().numpy(), outputs.squeeze().detach().numpy())
        loss = torch.FloatTensor([ap])
        loss.requires_grad_(True)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

I am trying to maximize the average precision score, hence the negative sign. For some reason, the loss stays the same throughout training, with or without the negative sign. What is going wrong here, and is it okay to use a loss function from sklearn as shown above? Thank you.

I was just wondering, did you try to wrap the sklearn loss function inside a custom loss function?

@mxahan Thanks for the suggestion. Can you show me or direct me to where I can see how that can be done?

def custom_loss(outputs, labels):
    loss = torch.sum(-average_precision_score(labels, outputs))
    return loss

Does it work?

Unfortunately, the loss still remains constant at every epoch after fixing the loss function the way you suggested. Here’s my new loss function:

def custom_loss(labels, outputs):
    loss = torch.FloatTensor([-average_precision_score(labels, outputs)])
    loss.requires_grad_(True)
    return loss


In training:

loss = custom_loss(labels.int().detach().numpy(), outputs.squeeze().detach().numpy())

loss.requires_grad_(True) seems necessary: leaving it out causes RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn. However, the fact that this is necessary makes me wonder whether PyTorch is actually tracking the loss as the quantity to minimize. Any thoughts on this? Thanks!

The error message points to a detached computation graph. Calling .requires_grad_() on the loss tensor gets rid of the error message, but it does not fix the underlying problem: the new loss tensor is not connected to the model parameters, so no gradients reach them.

You are currently breaking the computation graph in various places:

  • calling .detach() on a tensor explicitly detaches it from the computation graph
  • calling .numpy() also detaches the tensor from the computation graph. In general: using any 3rd party library will detach the tensor, as Autograd won’t be able to track these operations
  • re-creating a new tensor via torch.FloatTensor([value]) creates a fresh tensor with no connection to the computation graph
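
To make the effect of these breaks concrete, here is a minimal sketch. The tiny nn.Linear model, the random data, and the use of binary_cross_entropy_with_logits as a stand-in differentiable loss are all assumptions for illustration, not part of the original code. It compares a loss rebuilt from a NumPy value with a loss computed entirely from PyTorch operations:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Linear(4, 1)                      # hypothetical stand-in model
x = torch.randn(8, 4)                        # made-up inputs
target = torch.randint(0, 2, (8,)).float()   # made-up binary labels

outputs = model(x).squeeze(1)

# Detached path: go through NumPy, then rebuild the value as a new leaf tensor.
value = float(outputs.detach().numpy().mean())
detached_loss = torch.tensor(value, requires_grad=True)
detached_loss.backward()                     # runs without error...
grads_after_detached = [p.grad for p in model.parameters()]
# ...but the parameters received no gradient at all (every entry is None).

# Attached path: the loss is built from PyTorch ops only, so Autograd can track it.
attached_loss = F.binary_cross_entropy_with_logits(outputs, target)
attached_loss.backward()
grads_after_attached = [p.grad for p in model.parameters()]

print(detached_loss.grad_fn)                 # None: no history behind this tensor
print(attached_loss.grad_fn is not None)     # True: history leads back to the model
```

In the detached case optimizer.step() has nothing to apply, which would be consistent with a loss that never changes during training.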

If you want to use a 3rd party library such as sklearn.metrics.average_precision_score, you could use it in a custom autograd.Function and implement the backward pass manually.
The first thing I would check is if this method is differentiable at all. If so, you could also try to re-implement it in PyTorch directly.
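
As a quick way to probe the differentiability question, the sketch below uses a hand-written, pure-Python average precision (the helper average_precision is hypothetical, assumes no tied scores and at least one positive label, and is not sklearn's implementation). It suggests that the metric depends only on the ranking of the scores, so small changes in the scores leave it unchanged:

```python
def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive.

    Illustrative helper only: assumes no tied scores and at least one positive.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits


labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.1]

base = average_precision(labels, scores)

# Small perturbation that keeps the ranking intact: the metric does not move.
nudged_scores = [s + 1e-4 * k for k, s in enumerate(scores)]
nudged = average_precision(labels, nudged_scores)

# Larger change that lets the second positive overtake a negative: the metric jumps.
swapped = average_precision(labels, [0.9, 0.8, 0.85, 0.1])

print(base, nudged, swapped)
```

If the metric really is piecewise constant like this, its gradient with respect to the scores is zero almost everywhere, so even a manually written backward pass would give the optimizer little to work with; a differentiable surrogate loss would likely be the more practical route.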
