How to use Soft-label for Cross-Entropy loss?

oasjd7 · March 11, 2020, 8:33am

As far as I know, Cross-entropy Loss for Hard-label is:

import torch

def hard_label(input, target):
    # Per-sample cross entropy for integer class labels
    log_softmax = torch.nn.LogSoftmax(dim=1)
    nll = torch.nn.NLLLoss(reduction='none')
    return nll(log_softmax(input), target)

How, then, do I implement cross-entropy loss for soft labels?

Which softmax should I use, nn.Softmax() or nn.LogSoftmax()?
How should I make the target labels?
Just add random noise to the one-hot values?
[0,0,1] -> [0.1,0.2,0.7]

KFrank · March 11, 2020, 4:59pm

Hello oasjd7!

(Or you can just use torch.nn.CrossEntropyLoss.)

Please see the following thread for an implementation:

You should use LogSoftmax. You have to pass the output of Softmax
through log() anyway to calculate the cross entropy, and the
implementation of LogSoftmax is numerically more stable than the
mathematically (but not numerically) equivalent log(Softmax).
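As a minimal sketch of the point above (not the implementation from the linked thread), soft-label cross entropy can be written as the negative sum of target probabilities times log-probabilities from log_softmax. The function name soft_label_cross_entropy and the example tensors are made up for illustration. The last lines show why log(softmax(x)) is the less stable route.

```python
import torch
import torch.nn.functional as F


def soft_label_cross_entropy(logits, soft_targets):
    """Per-sample cross entropy for logits (N, C) and target probabilities (N, C)."""
    log_probs = F.log_softmax(logits, dim=1)
    # -sum over classes of p_c * log q_c
    return -(soft_targets * log_probs).sum(dim=1)


logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
soft_targets = torch.tensor([[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]])
soft_loss = soft_label_cross_entropy(logits, soft_targets)

# With one-hot targets this reduces to the usual hard-label loss.
hard_targets = torch.tensor([0, 2])
one_hot = F.one_hot(hard_targets, num_classes=3).float()
hard_loss = F.cross_entropy(logits, hard_targets, reduction="none")
assert torch.allclose(soft_label_cross_entropy(logits, one_hot), hard_loss)

# Stability check: with a large gap between logits, softmax underflows to 0
# and the log of that is -inf, while log_softmax stays finite.
extreme = torch.tensor([[1000.0, 0.0]])
naive = torch.log(torch.softmax(extreme, dim=1))
stable = F.log_softmax(extreme, dim=1)
```

The function returns one loss value per sample (like reduction='none'), so call .mean() on the result before backward() if you want a scalar.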

If you don’t naturally have soft target labels (probabilities across the
classes), I don’t see any value in ginning up soft labels by adding
noise to your 0, 1 (one-hot) labels. Just use CrossEntropyLoss
with your hard labels.

(If your hard labels are encoded as 0, 1-style one-hot labels you will
have to convert them to integer categorical class labels, as those are
what CrossEntropyLoss requires.)
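A small sketch of that conversion, assuming the one-hot labels have shape (N, C): argmax along the class dimension recovers the integer class labels. The tensors here are invented for illustration.

```python
import torch

# Three samples, three classes, encoded one-hot.
one_hot = torch.tensor([[0, 0, 1], [1, 0, 0], [0, 1, 0]])

# argmax over the class dimension gives integer (int64) class labels.
class_indices = one_hot.argmax(dim=1)

logits = torch.randn(3, 3)
loss = torch.nn.CrossEntropyLoss()(logits, class_indices)
```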

Best.

K. Frank

saba · July 14, 2020, 12:41am

Hi there,

I want to use label smoothing with criterion=nn.CrossEntropyLoss() and a batch size of 64. The labels are random numbers between 0.8 and 0.9, and the outputs come from a sigmoid. The code is:



label = (0.9 - 0.8) * torch.rand(b_size) + 0.8
label = label.to(device).type(torch.LongTensor)

# Forward pass real batch through D
netD = netD.float()
output = netD(real_cpu).view(-1)
# Calculate loss on all-real batch
output1 = torch.zeros(64, 64)
for ii in range(64):
    output1[:, ii] = ii
for ii in range(64):
    output1[ii, :] = output[ii].type(torch.LongTensor)

errD_real = criterion(output1, label)

and the error is:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

After applying .type(torch.LongTensor), all the labels and outputs become 0, and without it I get an error.
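One possible way around this, sketched under assumptions rather than taken from the thread: casting to LongTensor truncates values in [0.8, 0.9) to 0, and an integer tensor cannot carry gradients, which would explain the grad_fn error. For a single sigmoid output per sample, a binary criterion such as nn.BCELoss accepts float targets directly. Note this swaps CrossEntropyLoss for BCELoss. The raw tensor below is a stand-in for the discriminator, since netD and real_cpu are not shown in full.

```python
import torch
import torch.nn as nn

b_size = 64

# Stand-in for netD(real_cpu).view(-1): sigmoid outputs that track gradients.
raw = torch.randn(b_size, requires_grad=True)
output = torch.sigmoid(raw)

# Smoothed "real" labels, kept as floats in [0.8, 0.9).
label = (0.9 - 0.8) * torch.rand(b_size) + 0.8

# What the cast in the post does: every label is truncated to 0.
truncated = label.type(torch.LongTensor)

# Swapped-in criterion: binary cross entropy on probabilities with float targets.
criterion = nn.BCELoss()
errD_real = criterion(output, label)
errD_real.backward()
```

If the network returns raw logits instead of sigmoid outputs, nn.BCEWithLogitsLoss is the numerically safer variant of the same idea.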

sailist · August 30, 2020, 9:47pm

My solution

ube · September 2, 2022, 7:50am

This is probably too late to be useful, and I am not sure it would work, but what if you try inserting a manual cross-entropy function inside the forward pass:
soft loss = -soft label * log(hard label)

and then apply the hard loss to the soft loss,
which gives loss = -sum(hard label * soft loss).
But then you would have to make the soft loss exp(loss) to counteract applying the log function repeatedly.
I wonder how it would turn out.

ube · September 7, 2022, 9:12pm

https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html
This is actually way better
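For reference, a short sketch of what the linked documentation describes, assuming a PyTorch version recent enough to support it (1.10 or later, as far as I know): CrossEntropyLoss accepts a float target of the same shape as the input, interpreted as class probabilities, and it also has a label_smoothing argument for hard labels. The tensors are invented for illustration.

```python
import torch
import torch.nn as nn

logits = torch.randn(4, 3, requires_grad=True)

# Soft targets: one probability distribution over the 3 classes per sample.
soft_targets = torch.tensor([[0.1, 0.2, 0.7]]).repeat(4, 1)

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, soft_targets)
loss.backward()

# The same value computed by hand, for comparison.
manual = -(soft_targets * torch.log_softmax(logits.detach(), dim=1)).sum(dim=1).mean()

# Built-in label smoothing applied to ordinary integer class labels.
smoothing = nn.CrossEntropyLoss(label_smoothing=0.1)
hard_targets = torch.tensor([2, 0, 1, 2])
smoothed_loss = smoothing(logits.detach(), hard_targets)
```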

zabboud · October 28, 2022, 6:31pm

Has anyone tested the recent implementation that accepts class probabilities, rather than class indices, as the target for the cross-entropy loss (i.e. soft labels)? I’ve generated soft labels as target images for my application, which works well with binary cross entropy. I’ve changed the criterion to CrossEntropyLoss and pass a soft target image (with values in [0,1], as the documentation requires). However, the loss doesn’t seem to propagate well: it drops to 0 very quickly (despite regularization), and the probability map outputs do not change from the initial epoch.

If anyone has used this function successfully, any input would be helpful.

Please note that I’m using this for semantic segmentation, which is initially a binary classification task, but with the target changed to soft labels instead of hard binary labels.

Any input on what is going on would be helpful.
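One thing that may be worth checking, and this is only a guess since the post does not show tensor shapes: CrossEntropyLoss applies a softmax over the class dimension (dim 1). If the network emits a single channel, that softmax is always 1, so the loss is 0 and no gradient flows. A sketch of the difference, with made-up shapes and a two-channel (background, foreground) layout as an assumption:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
N, H, W = 2, 4, 4

# Soft foreground probability per pixel, values in [0, 1).
soft_fg = torch.rand(N, 1, H, W)

# Single-channel logits: softmax over a class dimension of size 1 is always 1,
# so log-softmax is 0 everywhere and the loss is 0 regardless of the target.
one_channel_logits = torch.randn(N, 1, H, W, requires_grad=True)
degenerate_loss = criterion(one_channel_logits, soft_fg)

# Two-channel logits with a target that sums to 1 across the class dimension.
two_channel_logits = torch.randn(N, 2, H, W, requires_grad=True)
two_channel_target = torch.cat([1.0 - soft_fg, soft_fg], dim=1)
loss = criterion(two_channel_logits, two_channel_target)
loss.backward()
```

If the model really does have one output channel, keeping a binary cross-entropy criterion with the soft targets is the simpler option.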