CrossEntropyLoss for float type target

yuri · January 14, 2022, 4:11pm

It is obvious why CrossEntropyLoss() only accepts Long type targets. However, I ran it in the PyCharm IDE with float type targets and it worked! My target is a 2D heatmap with a 0 background and a spot that has float values between 0 and 1. The same example raised an error on Colab.
My question: does Python explicitly assign a Long type value as a class to every distinct value?

anantguptadbl · January 14, 2022, 5:11pm

@yuri Please share your code snippet

KFrank · January 15, 2022, 2:17am

Hi Yuri!

As of pytorch version 1.10, CrossEntropyLoss will accept either integer
class labels (torch.int64) or per-class probabilities (torch.float32
or torch.float64) as its target.

I assume that your pycharm platform was using pytorch 1.10 or later
while colab was using a version prior to 1.10. You could probe this by
printing out torch.__version__ on both platforms.

(Using probabilistic (so-called "soft") labels with CrossEntropyLoss is a
perfectly reasonable thing to do, and it’s nice that pytorch now supports
this directly.)
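
A minimal sketch of the two target modes, assuming pytorch 1.10 or later; the tensor names and shapes are made up for illustration. With one-hot probabilities as the float target, the loss should match the one computed from the equivalent integer class labels:

```python
import torch
import torch.nn.functional as F

print(torch.__version__)  # probabilistic targets need 1.10 or later

torch.manual_seed(0)
logits = torch.randn(4, 3)  # hypothetical batch: 4 samples, 3 classes
loss_fn = torch.nn.CrossEntropyLoss()

# Mode 1: integer class labels (torch.int64), shape (4,)
hard_target = torch.tensor([0, 2, 1, 2])
loss_hard = loss_fn(logits, hard_target)

# Mode 2: per-class probabilities (floating point), same shape as logits
soft_target = F.one_hot(hard_target, num_classes=3).float()
loss_soft = loss_fn(logits, soft_target)

print(loss_hard.item(), loss_soft.item())  # the two values should agree
```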

Best.

K. Frank

yuri · January 15, 2022, 9:51am

Hello K. Frank,
thank you for your reply,
Actually, I checked the PyTorch version on Colab and it is 1.10, but what I also noticed is that CrossEntropyLoss internally switches to the probabilistic mode if Input and Target have the same size. The question now is: how does this work mathematically, especially when Target contains floats with arbitrary positive values, not just values between 0 and 1?

KFrank · January 16, 2022, 1:43am

Hi Yuri!

The formula for probabilistic mode is given in the CrossEntropyLoss
documentation.
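
As a rough sketch of that formula, assuming the default reduction="mean", no class weights, and no label smoothing: the loss at each position is the negative sum over classes of target times log_softmax(input), and the result is averaged over all non-class positions. The (N, C, H, W) shapes below are hypothetical, chosen to resemble a per-pixel target:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(2, 3, 4, 4)  # (N, C, H, W), classes along dim 1
# A valid probabilistic target: non-negative and summing to 1 over dim 1
target = torch.softmax(torch.randn(2, 3, 4, 4), dim=1)

# Built-in loss; a floating-point target with the input's shape is
# treated as per-class probabilities
builtin = F.cross_entropy(logits, target)

# The same quantity written out by hand
log_probs = F.log_softmax(logits, dim=1)
manual = -(target * log_probs).sum(dim=1).mean()

print(builtin.item(), manual.item())  # the two values should agree
```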

As you have recognized, it is possible to pass in invalid values for
target in the probabilistic case, namely values that do not represent
a valid (discrete) probability distribution. That is, individual values can
lie outside of [0.0, 1.0], or the values can sum to a value other than
1.0.

CrossEntropyLoss doesn’t validate this condition – it simply applies
the formula given in the documentation and will not return a sensible
result if you violate it. (For example, if some of the target values are
negative, you could get a negative loss that could cause your training
to diverge.) It’s up to you to pass in valid target values.
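
A small illustration of this, again assuming pytorch 1.10 or later and using made-up values: because the formula is linear in the target, a negative target entry flips the sign of its contribution, and a target that sums to 2.0 simply doubles the loss. No error is raised in either case.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.0]])  # one sample, two classes

valid = torch.tensor([[1.0, 0.0]])      # a proper probability distribution
negative = torch.tensor([[-1.0, 0.0]])  # invalid: negative entry
scaled = 2.0 * valid                    # invalid: sums to 2.0

loss_valid = F.cross_entropy(logits, valid)
loss_negative = F.cross_entropy(logits, negative)
loss_scaled = F.cross_entropy(logits, scaled)

print(loss_valid.item())     # positive
print(loss_negative.item())  # negative, the mirror image of loss_valid
print(loss_scaled.item())    # twice loss_valid
```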

Best.

K. Frank

yuri · January 16, 2022, 9:42am

Hello K Frank,
thank you for your reply.
I don’t think that Target has to have values between 0.0 and 1.0, since “weights”, the scaling parameter that is applied to each class, could easily pull the target values out of that range.
Indeed, negative values are problematic.