Does nn.CrossEntropyLoss internally build a one-hot encoding of the target?

Yes, you can directly index the logits, as seen in this small example which calculates the loss manually using an explicit but slow approach.
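
A minimal sketch of such a manual calculation (the batch size, number of classes, and seed are just illustrative assumptions):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)          # batch of 4 samples, 10 classes
target = torch.randint(0, 10, (4,))  # class indices

# Explicit but slow approach: compute log-softmax and pick the
# log-probability of the target class for each sample
log_probs = F.log_softmax(logits, dim=1)
manual_loss = -log_probs[torch.arange(logits.size(0)), target].mean()

# Should match the built-in criterion
criterion = torch.nn.CrossEntropyLoss()
print(torch.allclose(manual_loss, criterion(logits, target)))  # True
```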

No, as it would be wasteful to create the one-hot encoded target first and multiply it with the full logit tensor. The result would be equal to indexing just the logit at the position where the one-hot encoded target has its 1 value. The limitation is of course that indexing won’t work with “soft targets”, i.e. target tensors which contain probabilities rather than class indices. For this use case nn.CrossEntropyLoss accepts a FloatTensor with probabilities in newer PyTorch releases.
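
A rough sketch of the soft-target use case, assuming PyTorch 1.10 or newer (where probability targets are supported):

```python
import torch

logits = torch.randn(4, 10)
# Soft targets: each row is a probability distribution over the 10 classes
soft_targets = torch.softmax(torch.randn(4, 10), dim=1)

criterion = torch.nn.CrossEntropyLoss()
loss = criterion(logits, soft_targets)  # FloatTensor target with probabilities
print(loss)
```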

That’s not true, as you would multiply the logits of the uninteresting classes with a zero, so these values do not change the loss at all.
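
A small sketch to illustrate this equivalence, assuming random logits and class-index targets: multiplying the log-probabilities by a one-hot target gives the same loss as indexing them directly.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)
target = torch.randint(0, 10, (4,))
log_probs = F.log_softmax(logits, dim=1)

# One-hot approach: the zeros cancel every non-target log-probability
one_hot = F.one_hot(target, num_classes=10).float()
loss_one_hot = -(one_hot * log_probs).sum(dim=1).mean()

# Indexing approach: pick the target log-probability directly
loss_index = -log_probs[torch.arange(logits.size(0)), target].mean()

print(torch.allclose(loss_one_hot, loss_index))  # True
```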