Sklearn loss function

111242 · April 23, 2022, 1:24am

I am trying to use sklearn.metrics.average_precision_score as the loss function as follows:

optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for i, (region, labels) in tqdm.tqdm(enumerate(train_loader)):  
        region = region.to(device)
        labels = labels.to(device)
        # Forward pass
        outputs = model(region)
        ap = -average_precision_score(labels.float().detach().numpy(), outputs.squeeze().detach().numpy())
        loss = torch.FloatTensor([ap])
        loss.requires_grad_(True)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

I am trying to maximize the average precision score, hence the negative sign. For some reason, the loss stays the same throughout training, with or without the negative sign. What is going wrong here, and is it okay to use a loss function from sklearn as shown above? Thank you.

mxahan · April 23, 2022, 4:21am

I was just wondering, did you try to wrap the sklearn loss function inside a custom loss function?

111242 · April 23, 2022, 4:37am

@mxahan Thanks for the suggestion. Can you show me or direct me to where I can see how that can be done?

mxahan · April 23, 2022, 7:07pm

def custom_loss(outputs, labels):
    loss = torch.sum(-average_precision_score(labels, outputs))
    return loss

Does it work?

111242 · April 23, 2022, 8:59pm

Unfortunately, the loss still remains constant at every epoch after fixing the loss function the way you suggested. Here’s my new loss function:

def custom_loss(labels, outputs):
    loss = torch.FloatTensor([-average_precision_score(labels, outputs)])
    loss.requires_grad_(True)
    return loss

…
in training:

loss = custom_loss(labels.int().detach().numpy(), outputs.squeeze().detach().numpy())

loss.requires_grad_(True) seems necessary; leaving it out causes RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn. However, the fact that this is necessary makes me wonder whether PyTorch is really tracking the loss as the quantity to minimize. Any thoughts on this? Thanks!

ptrblck · April 23, 2022, 9:17pm

The error message points to a detached computation graph. Calling .requires_grad_() on the loss tensor gets rid of the error message, but it does not fix the underlying problem.

You are currently breaking the computation graph in various places:

calling .detach() on a tensor explicitly detaches it from the computation graph
calling numpy() also detaches the tensor from the computation graph. In general, using any 3rd party library will detach the tensor, as Autograd won’t be able to track these operations
re-creating a new tensor via torch.FloatTensor(tensor) will detach the tensor from the computation graph
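
The points above can be checked with a small experiment. This is only a minimal sketch: the linear model, the random input, and the use of .mean() as a placeholder loss are assumptions made for illustration and are not part of the original training code.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 1)      # hypothetical stand-in for the real model
x = torch.randn(8, 4)              # hypothetical input batch
outputs = model(x).squeeze(1)

# The model output is still attached: it has a grad_fn leading back to the weights.
attached = outputs.grad_fn is not None

# Going through .detach().numpy() and re-wrapping the value creates a new leaf tensor.
value = float(outputs.detach().numpy().mean())
rebuilt = torch.FloatTensor([value])
rebuilt.requires_grad_(True)       # silences the RuntimeError, nothing more
rebuilt.backward()                 # only fills rebuilt.grad
grads_after_rebuilt = [p.grad for p in model.parameters()]   # every entry is None

# Staying inside PyTorch keeps the graph, so gradients reach the parameters.
loss = outputs.mean()
loss.backward()
grads_after_attached = [p.grad for p in model.parameters()]  # every entry is a tensor
```

Because the parameters never receive a gradient in the rebuilt case, optimizer.step() has nothing to apply, which is consistent with the loss staying constant.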

If you want to use a 3rd party library such as sklearn.metrics.average_precision_score, you could use it in a custom autograd.Function and implement the backward pass manually.
The first thing I would check is if this method is differentiable at all. If so, you could also try to re-implement it in PyTorch directly.
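
For reference, a custom autograd.Function around a third-party call could look roughly like the sketch below. The class name NumpyTanh is hypothetical, and np.tanh is used purely as a stand-in because its derivative is known in closed form. average_precision_score depends only on the ranking of the scores, so it is piecewise constant in the model outputs, and a manual backward for it would have to supply some surrogate gradient, which is not attempted here.

```python
import numpy as np
import torch


class NumpyTanh(torch.autograd.Function):
    """Hypothetical example: run the forward pass in NumPy, define backward by hand."""

    @staticmethod
    def forward(ctx, inp):
        # Leaving PyTorch is acceptable here because backward() is written manually.
        result_np = np.tanh(inp.detach().cpu().numpy())
        result = torch.from_numpy(result_np).to(inp.device)
        ctx.save_for_backward(result)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        (result,) = ctx.saved_tensors
        # Chain rule with d/dx tanh(x) = 1 - tanh(x)^2
        return grad_output * (1 - result ** 2)


x = torch.randn(5, dtype=torch.double, requires_grad=True)
loss = NumpyTanh.apply(x).sum()
loss.backward()                    # x.grad is now populated via the manual backward
```

torch.autograd.gradcheck can be used on double-precision inputs to compare a hand-written backward against numerical gradients before relying on it in training.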