Soft Cross Entropy Loss (TF has it, does Pytorch have it?)

softmax_cross_entropy_with_logits

TF does not require hard labels for its cross-entropy loss:

logits = [[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]]
labels = [[1.0, 0.0, 0.0], [0.0, 0.8, 0.2]]
tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

Can we do the same thing in Pytorch?


Hello Raaj!

I do not believe that pytorch has a “soft” cross-entropy function built in.
But you can implement it using pytorch tensor operations, so you should
get the full benefit of autograd and gpu acceleration.

See this (pytorch version 0.3.0) script:

import torch
torch.__version__

# define "soft" cross-entropy with pytorch tensor operations
def softXEnt (input, target):
    logprobs = torch.nn.functional.log_softmax (input, dim = 1)
    return  -(target * logprobs).sum() / input.shape[0]

torch.manual_seed (2020)

# input values are logits
input  = torch.autograd.Variable (torch.randn ((2, 5)))
# target values are "soft" probabilities that sum to one (for each sample in batch)
target = torch.nn.functional.softmax (torch.autograd.Variable (torch.randn ((2, 5))), dim = 1)

input
target
softXEnt (input, target)

# make "hard" categorical target
dummy, target_cat = target.max (dim = 1)
# make "hard" one-hot target
target_onehot = torch.zeros_like (target).scatter (1, target_cat.unsqueeze (1), 1)

target_cat
target_onehot
# check that softXEnt agrees with pytorch's cross_entropy for "hard" case
torch.nn.functional.cross_entropy (input, target_cat)
softXEnt (input, target_onehot)

Here is the output:

>>> import torch
>>> torch.__version__
'0.3.0b0+591e73e'
>>>
>>> # define "soft" cross-entropy with pytorch tensor operations
... def softXEnt (input, target):
...     logprobs = torch.nn.functional.log_softmax (input, dim = 1)
...     return  -(target * logprobs).sum() / input.shape[0]
...
>>> torch.manual_seed (2020)
<torch._C.Generator object at 0x000001F948906630>
>>>
>>> # input values are logits
... input  = torch.autograd.Variable (torch.randn ((2, 5)))
>>> # target values are "soft" probabilities that sum to one (for each sample in batch)
... target = torch.nn.functional.softmax (torch.autograd.Variable (torch.randn ((2, 5))), dim = 1)
>>>
>>> input
Variable containing:
 1.2372 -0.9604  1.5415 -0.4079  0.8806
 0.0529  0.0751  0.4777 -0.6759 -2.1489
[torch.FloatTensor of size 2x5]

>>> target
Variable containing:
 0.0629  0.1508  0.5417  0.1899  0.0547
 0.0867  0.0389  0.0408  0.0659  0.7677
[torch.FloatTensor of size 2x5]

>>> softXEnt (input, target)
Variable containing:
 2.4262
[torch.FloatTensor of size 1]

>>>
>>> # make "hard" categorical target
... dummy, target_cat = target.max (dim = 1)
>>> # make "hard" one-hot target
... target_onehot = torch.zeros_like (target).scatter (1, target_cat.unsqueeze (1), 1)
>>>
>>> target_cat
Variable containing:
 2
 4
[torch.LongTensor of size 2]

>>> target_onehot
Variable containing:
 0  0  1  0  0
 0  0  0  0  1
[torch.FloatTensor of size 2x5]

>>> # check that softXEnt agrees with pytorch's cross_entropy for "hard" case
... torch.nn.functional.cross_entropy (input, target_cat)
Variable containing:
 2.2656
[torch.FloatTensor of size 1]

>>> softXEnt (input, target_onehot)
Variable containing:
 2.2656
[torch.FloatTensor of size 1]
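
For reference, here is a minimal sketch of the same function written for a more
recent PyTorch (assuming version >= 1.0, where Variable has been merged into
Tensor); the printed values may differ from the 0.3.0 output above:

import torch
import torch.nn.functional as F

# "soft" cross-entropy for a recent PyTorch (no Variable wrapper needed)
def softXEnt (input, target):
    # input: (N, C) logits; target: (N, C) probabilities that sum to one per row
    logprobs = F.log_softmax (input, dim = 1)
    return -(target * logprobs).sum() / input.shape[0]

torch.manual_seed (2020)
input  = torch.randn (2, 5)                      # logits
target = F.softmax (torch.randn (2, 5), dim = 1) # "soft" probability targets

print (softXEnt (input, target))

# agreement check with the built-in loss for a "hard" target
target_cat = target.argmax (dim = 1)
target_onehot = torch.zeros_like (target).scatter_ (1, target_cat.unsqueeze (1), 1.0)
print (F.cross_entropy (input, target_cat))
print (softXEnt (input, target_onehot))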

Good luck!

K. Frank


You can pass soft labels, no problem

Hi Juan!

No, this simply isn’t true.

From the CrossEntropyLoss documentation:

Shape:

   Input: (N, C) where C = number of classes, ...

   Target: (N) where each value is 0 <= targets[i] <= C-1, ...

Raaj’s example target (“labels”):

labels = [[1.0, 0.0, 0.0], [0.0, 0.8, 0.2]]

is neither the right shape (should be (2,) rather than (2, 3)),
nor of the right values (should be one of the integers 0, 1, 2,
rather than a float, e.g., 0.8). As such, it won’t be accepted as
the target argument by CrossEntropyLoss.
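
For concreteness, a minimal sketch of what a pre-1.10 CrossEntropyLoss does and
does not accept, using Raaj's logits, with hard labels taken (for illustration)
as the argmax of his soft labels:

import torch

loss_fn = torch.nn.CrossEntropyLoss()

logits = torch.tensor ([[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]])     # shape (2, 3)

# accepted: integer class indices of shape (2,)
hard_labels = torch.tensor ([0, 1])
print (loss_fn (logits, hard_labels))

# not accepted (before 1.10): float probabilities of shape (2, 3)
soft_labels = torch.tensor ([[1.0, 0.0, 0.0], [0.0, 0.8, 0.2]])
# loss_fn (logits, soft_labels)   # raises an error on these older versions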

Best.

K. Frank


Works correctly, thanks!

Oh, I was thinking of Binary Cross Entropy. My bad.

As of the current stable version, pytorch 1.10.0, “soft” cross-entropy
labels are now supported. See:

CrossEntropyLoss – 1.10.0.
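
For example, a minimal sketch (assuming PyTorch >= 1.10) that passes class
probabilities directly, analogous to Raaj's original TF call:

import torch

# assumes PyTorch >= 1.10, where the target may be class probabilities
loss_fn = torch.nn.CrossEntropyLoss()

logits = torch.tensor ([[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]])
soft_labels = torch.tensor ([[1.0, 0.0, 0.0], [0.0, 0.8, 0.2]])   # each row sums to one

print (loss_fn (logits, soft_labels))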

Best.

K. Frank

From the linked document, I think the current CrossEntropyLoss in PyTorch only supports a linear combination between the original target and a uniform distribution, with the coefficient given by label_smoothing (float, optional) – a float in [0.0, 1.0].

Assume four classes and a one-hot ground-truth target of [1, 0, 0, 0]. The smoothed label will be [1, 0, 0, 0] * (1 - label_smoothing) + label_smoothing * [0.25, 0.25, 0.25, 0.25].
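
A minimal sketch (assuming PyTorch >= 1.10) that checks this formula against
the built-in label_smoothing option; the class count and smoothing value are
illustrative:

import torch
import torch.nn.functional as F

# assumes PyTorch >= 1.10; checks the smoothing formula described above
num_classes = 4
eps = 0.1                          # label_smoothing coefficient (illustrative)

torch.manual_seed (0)
logits = torch.randn (1, num_classes)
hard_target = torch.tensor ([0])   # one-hot ground truth [1, 0, 0, 0]

# built-in label smoothing
builtin = F.cross_entropy (logits, hard_target, label_smoothing = eps)

# manual smoothing: (1 - eps) * one-hot + eps * uniform
one_hot = F.one_hot (hard_target, num_classes).float()
smoothed = (1 - eps) * one_hot + eps / num_classes
manual = -(smoothed * F.log_softmax (logits, dim = 1)).sum (dim = 1).mean()

print (builtin, manual)            # should agree up to floating-point error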

To support arbitrary soft targets, Frank's earlier code above still works.