Inconsistency between loss functions' input types

When using nn.BCELoss, the target has to be a tensor with dtype Float, whereas for nn.CrossEntropyLoss it has to be Long.
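
A minimal sketch of the mismatch (the values are illustrative, and the exact error messages may differ across versions):

```python
import torch
import torch.nn as nn

probs = torch.sigmoid(torch.randn(4))     # nn.BCELoss expects probabilities in [0, 1]
nn.BCELoss()(probs, torch.tensor([0., 1., 1., 0.]))    # OK: Float target
# nn.BCELoss()(probs, torch.tensor([0, 1, 1, 0]))      # RuntimeError: Long target rejected

logits = torch.randn(4, 3)                # raw scores for 3 classes
nn.CrossEntropyLoss()(logits, torch.tensor([0, 2, 1, 0]))        # OK: Long class indices
# nn.CrossEntropyLoss()(logits, torch.tensor([0., 2., 1., 0.]))  # fails: Float rejected as class indices
```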

Why is that so? Why not stick to either Long or Float?

This makes it difficult to create a wrapper, since we cannot cast the input/target to a single (static) dtype. One either has to wrap the call in a try/except and do the conversion based on the error message, or perform a lot of unnecessary type checks.
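
For reference, this is the kind of workaround I mean. It's a hypothetical sketch (the name `apply_loss` and the fallback logic are mine):

```python
import torch
import torch.nn as nn

def apply_loss(criterion, output, target):
    # Hypothetical wrapper: try the target as-is, and if the loss rejects
    # its dtype, retry once with the "other" dtype family.
    try:
        return criterion(output, target)
    except RuntimeError:
        cast = target.long() if target.is_floating_point() else target.float()
        return criterion(output, cast)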

Hi Jakob!

Historically, pytorch’s CrossEntropyLoss was restricted to integer
class labels for its target and did not support probabilistic “soft”
labels. These are naturally integers, hence Long. In contrast,
BCELoss (and its to-be-preferred sibling, BCEWithLogitsLoss)
support (and require) probabilities for target, hence Float and
Double.
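
For example (a small sketch; the values are illustrative):

```python
import torch
import torch.nn as nn

# BCEWithLogitsLoss takes raw logits (no sigmoid needed) and a Float target,
# which may be hard 0.0 / 1.0 labels or soft probabilities in between.
loss_fn = nn.BCEWithLogitsLoss()
logits = torch.randn(4)
target = torch.tensor([0.0, 1.0, 0.3, 1.0])
loss = loss_fn(logits, target)
```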

However, as of pytorch version 1.10.0, CrossEntropyLoss supports
both categorical (integer) and probabilistic target values, so this
inconsistency should be gone now.
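
So on 1.10.0 or later, something like this works with either style of
target (a quick sketch):

```python
import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()
logits = torch.randn(4, 3)

# Categorical target: Long class indices, shape (N,)
loss_hard = ce(logits, torch.tensor([0, 2, 1, 0]))

# Probabilistic "soft" target: Float probabilities, shape (N, C), rows summing to one
soft = torch.softmax(torch.randn(4, 3), dim=1)
loss_soft = ce(logits, soft)
```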

Best.

K. Frank