Soft Cross Entropy Loss (TF has it, does PyTorch have it?)

soulslicer · February 12, 2020, 3:49pm

TF's cross entropy loss does not require hard labels; it accepts soft labels directly:

logits = [[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]]
labels = [[1.0, 0.0, 0.0], [0.0, 0.8, 0.2]]
tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

Can we do the same thing in Pytorch?
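
For reference, here is a minimal pure-Python sketch of the quantity being asked about: for each sample, minus the sum over classes of label times log-softmax of the logits. The function name soft_cross_entropy is only illustrative; it is not a TF or PyTorch API.

```python
import math

def soft_cross_entropy(logits, labels):
    """Per-sample cross entropy between soft labels and softmax(logits)."""
    losses = []
    for row_logits, row_labels in zip(logits, labels):
        # log-sum-exp, with the max subtracted for numerical stability
        m = max(row_logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in row_logits))
        log_probs = [x - log_z for x in row_logits]
        losses.append(-sum(p * lp for p, lp in zip(row_labels, log_probs)))
    return losses

logits = [[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]]
labels = [[1.0, 0.0, 0.0], [0.0, 0.8, 0.2]]
losses = soft_cross_entropy(logits, labels)
print(losses)  # roughly [0.1698, 0.8247]
```

Note that this returns one loss per sample rather than a batch average.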

KFrank · February 12, 2020, 10:12pm

Hello Raaj!

I do not believe that pytorch has a “soft” cross-entropy function built in.
But you can implement it using pytorch tensor operations, so you should
get the full benefit of autograd and gpu acceleration.

See this (pytorch version 0.3.0) script:

import torch
torch.__version__

# define "soft" cross-entropy with pytorch tensor operations
def softXEnt (input, target):
    logprobs = torch.nn.functional.log_softmax (input, dim = 1)
    return  -(target * logprobs).sum() / input.shape[0]

torch.manual_seed (2020)

# input values are logits
input  = torch.autograd.Variable (torch.randn ((2, 5)))
# target values are "soft" probabilities that sum to one (for each sample in batch)
target = torch.nn.functional.softmax (torch.autograd.Variable (torch.randn ((2, 5))), dim = 1)

input
target
softXEnt (input, target)

# make "hard" categorical target
dummy, target_cat = target.max (dim = 1)
# make "hard" one-hot target
target_onehot = torch.zeros_like (target).scatter (1, target_cat.unsqueeze (1), 1)

target_cat
target_onehot
# check that softXEnt agrees with pytorch's cross_entropy for "hard" case
torch.nn.functional.cross_entropy (input, target_cat)
softXEnt (input, target_onehot)

Here is the output:

>>> import torch
>>> torch.__version__
'0.3.0b0+591e73e'
>>>
>>> # define "soft" cross-entropy with pytorch tensor operations
... def softXEnt (input, target):
...     logprobs = torch.nn.functional.log_softmax (input, dim = 1)
...     return  -(target * logprobs).sum() / input.shape[0]
...
>>> torch.manual_seed (2020)
<torch._C.Generator object at 0x000001F948906630>
>>>
>>> # input values are logits
... input  = torch.autograd.Variable (torch.randn ((2, 5)))
>>> # target values are "soft" probabilities that sum to one (for each sample in batch)
... target = torch.nn.functional.softmax (torch.autograd.Variable (torch.randn ((2, 5))), dim = 1)
>>>
>>> input
Variable containing:
 1.2372 -0.9604  1.5415 -0.4079  0.8806
 0.0529  0.0751  0.4777 -0.6759 -2.1489
[torch.FloatTensor of size 2x5]

>>> target
Variable containing:
 0.0629  0.1508  0.5417  0.1899  0.0547
 0.0867  0.0389  0.0408  0.0659  0.7677
[torch.FloatTensor of size 2x5]

>>> softXEnt (input, target)
Variable containing:
 2.4262
[torch.FloatTensor of size 1]

>>>
>>> # make "hard" categorical target
... dummy, target_cat = target.max (dim = 1)
>>> # make "hard" one-hot target
... target_onehot = torch.zeros_like (target).scatter (1, target_cat.unsqueeze (1), 1)
>>>
>>> target_cat
Variable containing:
 2
 4
[torch.LongTensor of size 2]

>>> target_onehot
Variable containing:
 0  0  1  0  0
 0  0  0  0  1
[torch.FloatTensor of size 2x5]

>>> # check that softXEnt agrees with pytorch's cross_entropy for "hard" case
... torch.nn.functional.cross_entropy (input, target_cat)
Variable containing:
 2.2656
[torch.FloatTensor of size 1]

>>> softXEnt (input, target_onehot)
Variable containing:
 2.2656
[torch.FloatTensor of size 1]
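
The script above targets pytorch 0.3.0. As a rough sketch of how the same idea might look on a recent version (assuming torch.autograd.Variable is no longer needed and that torch.nn.functional.one_hot is available), with soft_xent being just an illustrative name:

```python
import torch
import torch.nn.functional as F

def soft_xent(logits, soft_targets):
    # batch mean of -sum_c target[c] * log_softmax(logits)[c]
    log_probs = F.log_softmax(logits, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()

torch.manual_seed(2020)
logits = torch.randn(2, 5, requires_grad=True)
soft_targets = F.softmax(torch.randn(2, 5), dim=1)

loss = soft_xent(logits, soft_targets)
loss.backward()  # gradients reach the logits through autograd

# "hard" case: one-hot targets should reproduce F.cross_entropy
hard = soft_targets.argmax(dim=1)
one_hot = F.one_hot(hard, num_classes=5).to(logits.dtype)
agrees = torch.allclose(soft_xent(logits, one_hot), F.cross_entropy(logits, hard))
print(agrees)
```

The random values drawn here need not match the 0.3.0 output shown above, since the random number stream can differ between versions.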

Good luck!

K. Frank

JuanFMontesinos · February 12, 2020, 10:23pm

You can pass soft labels, no problem

KFrank · February 13, 2020, 2:46am

Hi Juan!

No, this simply isn’t true.

From the CrossEntropyLoss documentation:

Shape:

   Input: (N, C) where C = number of classes, ...

   Target: (N) where each value is 0 <= targets[i] <= C-1, ...

Raaj’s example target (“labels”):

labels = [[1.0, 0.0, 0.0], [0.0, 0.8, 0.2]]

is neither the right shape (should be (2,) rather than (2, 3)),
nor of the right values (should be one of the integers 0, 1, 2,
rather than a float, e.g., 0.8). As such, it won’t be accepted as
the target argument by CrossEntropyLoss.

Best.

K. Frank

soulslicer · February 13, 2020, 3:39am

works correctly thanks

JuanFMontesinos · February 13, 2020, 7:56am

Oh, I was thinking of Binary Cross Entropy. My bad.
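
For comparison, a small sketch of the binary case, assuming a recent PyTorch: binary_cross_entropy_with_logits does accept float targets in [0, 1], but it treats every entry as an independent two-class problem rather than as one distribution over classes.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]])
soft_targets = torch.tensor([[1.0, 0.0, 0.0], [0.0, 0.8, 0.2]])

# each of the 6 entries is scored as its own yes/no problem (sigmoid, not softmax)
bce = F.binary_cross_entropy_with_logits(logits, soft_targets)

# the same quantity written out by hand
p = torch.sigmoid(logits)
manual = -(soft_targets * p.log() + (1 - soft_targets) * (1 - p).log()).mean()
```

So this is not the same loss as the softmax-based soft cross entropy discussed in this thread.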

KFrank · December 2, 2021, 5:35pm

As of the current stable version, pytorch 1.10.0, “soft” cross-entropy
labels are now supported. See:

CrossEntropyLoss (PyTorch 1.10 documentation).
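
A minimal sketch of what this looks like, assuming pytorch 1.10 or later, where the documentation describes a target containing class probabilities with the same shape as the input. It uses the logits and labels from the original question and compares against the hand-written version:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]])
labels = torch.tensor([[1.0, 0.0, 0.0], [0.0, 0.8, 0.2]])

# floating-point target with the same shape as the logits: class probabilities
builtin = F.cross_entropy(logits, labels)
per_sample = F.cross_entropy(logits, labels, reduction="none")

# reference: hand-written soft cross entropy, averaged over the batch
manual = -(labels * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```

With the default reduction, builtin is the batch mean, while per_sample holds one loss per row.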

Best.

K. Frank

ShengweiAn · July 12, 2022, 6:08pm

From the linked documentation, I think the current CrossEntropyLoss in PyTorch only supports a linear combination of the original target and a uniform distribution, with the coefficient given by label_smoothing (float, optional): a float in [0.0, 1.0].

Assume there are four classes and the ground-truth class is the first one. The smoothed label will then be [1,0,0,0] * (1-label_smoothing) + label_smoothing*[0.25,0.25,0.25,0.25].

To support arbitrary soft targets, K. Frank's earlier softXEnt code works.
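
The smoothing formula above can be checked numerically. This is only a sketch, assuming pytorch 1.10 or later (where cross_entropy takes a label_smoothing argument); the sizes, seed and eps value are made up for illustration:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_classes, eps = 4, 0.1
logits = torch.randn(3, num_classes)
target = torch.tensor([0, 2, 1])  # hard class indices

# one_hot * (1 - eps) + eps * uniform, as in the formula above
one_hot = F.one_hot(target, num_classes).float()
uniform = torch.full_like(one_hot, 1.0 / num_classes)
smoothed = one_hot * (1 - eps) + eps * uniform

# soft cross entropy against the smoothed labels, averaged over the batch
manual = -(smoothed * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
builtin = F.cross_entropy(logits, target, label_smoothing=eps)
```

Here manual and builtin should agree up to floating-point error.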

David_Toscano · January 19, 2025, 11:25pm

The inputs to the function would be:
cls_score → logits
class_weight → per-class weights, if the classes are weighted, for example:
weights = [1/10] * num_classes
weights[4] = 1
class_weight = torch.tensor(weights)
label → soft labels, for example [0, 0.2, 0.8]

The soft cross entropy loss would be:
lsm = F.log_softmax(cls_score, 1)
lsm = lsm * class_weight.unsqueeze(0)
loss_cls = -(label * lsm).sum(1)
loss_cls = loss_cls.sum() / torch.sum(class_weight.unsqueeze(0) * label)
loss_cls = loss_cls.mean()  # already a scalar, so this is a no-op

and then you can call loss_cls.backward()
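
Put together as a runnable sketch (the function name weighted_soft_xent, the shapes and the random inputs are just for illustration), with a check that one-hot labels reproduce the built-in weighted cross_entropy:

```python
import torch
import torch.nn.functional as F

def weighted_soft_xent(cls_score, label, class_weight):
    # weight each class's log-probability, then normalize by the total weight
    lsm = F.log_softmax(cls_score, dim=1) * class_weight.unsqueeze(0)
    per_sample = -(label * lsm).sum(dim=1)
    return per_sample.sum() / (class_weight.unsqueeze(0) * label).sum()

torch.manual_seed(0)
num_classes = 5
cls_score = torch.randn(3, num_classes, requires_grad=True)
class_weight = torch.full((num_classes,), 0.1)
class_weight[4] = 1.0

# soft labels: each row is a probability distribution over the classes
label = F.softmax(torch.randn(3, num_classes), dim=1)
loss = weighted_soft_xent(cls_score, label, class_weight)
loss.backward()

# sanity check: with one-hot labels this matches cross_entropy(weight=...)
hard = torch.tensor([4, 0, 2])
one_hot = F.one_hot(hard, num_classes).float()
same = torch.allclose(
    weighted_soft_xent(cls_score, one_hot, class_weight),
    F.cross_entropy(cls_score, hard, weight=class_weight),
)
```

The agreement in the hard-label case holds because cross_entropy with class-index targets and the default mean reduction also divides by the sum of the weights of the target classes.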