Multi-class focal loss for probability (soft) labels from mixup?

I’ve been trying to combine mixup with focal loss for multi-class training. I patched together code from a few open-source multi-class focal loss implementations, but I can’t work out how to adapt it to labels that are probabilities (e.g. produced by mixup augmentation) rather than hard one-hot / index labels.
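For context, the mixed labels look something like the snippet below (`lam`, `y_a`, `y_b` are just illustrative names and values, not from any particular mixup implementation):

import torch
import torch.nn.functional as F

lam = 0.7  # mixing coefficient, normally drawn from a Beta distribution
y_a = F.one_hot(torch.tensor(3), num_classes=15).float()
y_b = F.one_hot(torch.tensor(7), num_classes=15).float()
y_mix = lam * y_a + (1 - lam) * y_b  # soft label: sums to 1 but is not one-hot

Here is my current code, which only handles hard labels (class indices):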

def focal_loss(p, t, alpha=None, gamma=2.0):
    # p: raw logits of shape (N, C); t: hard class indices of shape (N,)
    loss_val = F.cross_entropy(p, t, weight=alpha, reduction='none')
    p = F.softmax(p, dim=1)

    n_classes = p.size(1)
    # This is the part I cannot adapt to probability-type labels:
    # it builds a one-hot tensor from the hard class indices.
    t = t.unsqueeze(1)
    shape = t.shape
    target_onehot = torch.zeros(shape[0], n_classes, *shape[2:], dtype=p.dtype, device=p.device)
    target_onehot.scatter_(1, t, 1.0)

    # focal modulation: (1 - p)^gamma for the true class, p^gamma for the rest
    focal_weight = 1 - torch.where(torch.eq(target_onehot, 1.0), p, 1 - p)
    focal_weight = focal_weight.pow(gamma)

    if loss_val.dim() < focal_weight.dim():
        loss_val = loss_val.unsqueeze(1)

    # combine; the class weighting (alpha) is already applied inside cross_entropy
    alpha_weight = 1.0
    loss = focal_weight * alpha_weight * loss_val
    return torch.mean(loss)

To use it,

num_classes = 15
pred = torch.rand((16, 50, num_classes), dtype=torch.float)
label = torch.randint(0, num_classes, (16, 50))  # hard class indices
# The label type I expect from mixup is instead:
# label = torch.rand((16, 50, num_classes), dtype=torch.float)

loss = focal_loss(pred.reshape(-1, num_classes), label.reshape(-1), alpha=None)
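
What I think I need is something along these lines, where the cross entropy is computed against the soft target directly and the softmax probabilities drive the focal term; the whole soft_focal_loss sketch below is my own guess rather than something taken from a reference implementation, so I am not sure it is a correct generalization:

def soft_focal_loss(p, t, alpha=None, gamma=2.0):
    # p: raw logits of shape (N, C); t: probability labels of shape (N, C), e.g. from mixup
    log_prob = F.log_softmax(p, dim=1)
    prob = log_prob.exp()
    ce = -t * log_prob                    # per-class cross entropy against the soft target
    if alpha is not None:
        ce = ce * alpha                   # optional per-class weights, broadcast over the batch
    focal_weight = (1 - prob).pow(gamma)  # down-weight classes the model already predicts well
    return (focal_weight * ce).sum(dim=1).mean()

# For a one-hot t this reduces to the usual focal loss, but I have not verified
# that it behaves sensibly for mixed labels:
soft_label = torch.rand((16, 50, num_classes)).softmax(dim=-1)
loss_soft = soft_focal_loss(pred.reshape(-1, num_classes), soft_label.view(-1, num_classes))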

Could anyone with more knowledge of focal loss help me with this? Many thanks in advance.