Hi all, I want to compute the cross-entropy between two 2D tensors that are the outputs of the softmax function.

P=nn.CrossEntropyLoss(softmax_out1,softmax_out2)

softmax_out1 and softmax_out2 are 2D tensors of shape (128, 10), where 128 is the batch size and 10 is the number of classes.
The following error occurs:

RuntimeError: 1D target tensor expected, multi-target not supported

Any example code to handle this error would be appreciated.

In PyTorch, nn.CrossEntropyLoss combines LogSoftmax and NLLLoss. The inputs to nn.CrossEntropyLoss should be the logits and the original integer targets, not the softmax probabilities themselves.

Also, it should not be used as loss = nn.CrossEntropyLoss(output, target),

but instead as: loss = nn.CrossEntropyLoss()(output, target)
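For example, with the shapes from your question (the tensors here are random placeholders, just to illustrate the expected shapes and dtypes):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()        # instantiate the loss module first

logits = torch.randn(128, 10)            # raw scores, shape (batch, classes)
targets = torch.randint(0, 10, (128,))   # integer class labels, shape (batch,)

loss = criterion(logits, targets)        # scalar tensor
```

Note that targets is a 1D tensor of integer labels, not a (128, 10) tensor of probabilities — that is exactly what the "1D target tensor expected" error is complaining about.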

The short answer is that you have to write your own cross-entropy
function to do what you want – see below.

There are two things going on here:

First, as Aman noted, the input to CrossEntropyLoss (your softmax_out1) should be raw-score logits that range from -inf to +inf, rather than probabilities that range from 0.0 to 1.0. So you want to pass logits in as the input, without converting them to probabilities by running them through softmax().

Second, CrossEntropyLoss expects its target (your softmax_out2)
to be integer class labels (with shape [nBatch], rather than [nBatch, nClass]). So CategoricalCrossEntropyWithLogitsLoss
might be a better (if lengthier) name for this loss function.

Now, how to do what you want:

Even if you write your own cross-entropy loss function, you do not
want to pass in probabilities for your input as doing so will be less
numerically stable than passing in logits.
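The stability issue can be seen with extreme logits (a contrived example): computing softmax first and then taking the log underflows, while log_softmax stays finite because it applies the log-sum-exp trick internally.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[100.0, 0.0, -100.0]])

# Naive: softmax then log -- the smallest probability underflows to 0,
# so its log becomes -inf
naive = torch.log(torch.softmax(logits, dim=1))

# Stable: log_softmax computes the same quantity without underflow
stable = F.log_softmax(logits, dim=1)
```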

It does, however, make sense to use probabilities (rather than integer
class labels) for your target. (These are sometimes called soft labels
or soft targets.) It's just that PyTorch doesn't offer such a version of
cross entropy.

The following post shows how to implement such a "soft cross-entropy"
loss. It takes logits for its input (for numerical stability) and takes
probabilities for its "soft-label" target:
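As a rough sketch (my own, not necessarily identical to the linked implementation; it assumes mean reduction over the batch), such a loss can look like:

```python
import torch
import torch.nn.functional as F

def soft_cross_entropy(logits, soft_targets):
    """Cross-entropy with probability ('soft') targets.

    logits:       (batch, classes) raw scores
    soft_targets: (batch, classes) probabilities summing to 1 per row
    """
    log_probs = F.log_softmax(logits, dim=1)          # numerically stable
    return -(soft_targets * log_probs).sum(dim=1).mean()

# Shapes from the question: batch of 128, 10 classes
logits = torch.randn(128, 10)
soft_targets = torch.softmax(torch.randn(128, 10), dim=1)
loss = soft_cross_entropy(logits, soft_targets)
```

When soft_targets happens to be one-hot, this reduces to the ordinary cross-entropy that nn.CrossEntropyLoss computes.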