Why can the dimensions of the input and target differ when using torch.nn.CrossEntropyLoss? I remember that one-hot encoding was needed when using Keras. Why is that?
nn.CrossEntropyLoss expects the target to be a LongTensor containing class indices rather than a one-hot encoded target. Since the loss indexes the logits at the target class for each sample, the class indices can be used directly, so the input has shape (N, C) while the target has shape (N,). Have a look at the docs for more information.
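A minimal sketch of what this looks like in practice (shapes and class counts are just illustrative): the input holds one logit per class per sample, while the target holds a single class index per sample, and the loss is equivalent to picking the log-softmax value at the target index:

```python
import torch
import torch.nn as nn

# Batch of 4 samples, 3 classes (illustrative shapes).
logits = torch.randn(4, 3)            # input shape (N, C)
target = torch.tensor([0, 2, 1, 2])   # target shape (N,), class indices as a LongTensor

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target)      # scalar loss

# Equivalent manual computation: negative log-softmax at the target index,
# averaged over the batch.
log_probs = torch.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(4), target].mean()
assert torch.allclose(loss, manual)
```

Note that no one-hot conversion is needed; passing a one-hot FloatTensor of shape (N, C) as the target here would only work on newer PyTorch versions that accept class probabilities.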
Thanks for your help.