Hello all,
I am trying to understand how nn.CrossEntropyLoss() behaves when applied to a batch.
Assume I am performing binary classification and the batch size is B, so the output of my CNN has dimensions B×2.
I calculate the loss as loss = criterion(y, st), where y is the model's output (of dimensions B×2) and st holds the correct labels (0 or 1).
From my understanding, for each entry in the batch it computes the softmax and then calculates the loss.
It then sums all of these per-sample loss values and divides the result by the batch size.
So 'loss' in this case is a single scalar tensor.
Did I get this right?
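To make the question concrete, here is a small sketch of what I believe is happening, comparing nn.CrossEntropyLoss() (with its default reduction='mean') against a manual per-row log-softmax computation. The variable names (y, st, B) and the random data are just placeholders for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
B = 4
y = torch.randn(B, 2)           # stand-in for the model output, shape B x 2
st = torch.randint(0, 2, (B,))  # stand-in for the labels, shape B, values 0 or 1

criterion = nn.CrossEntropyLoss()  # default reduction='mean'
loss = criterion(y, st)

# manual version: log-softmax each row, take the entry at the true label,
# negate, then average over the batch
manual = -F.log_softmax(y, dim=1)[torch.arange(B), st].mean()

print(loss.shape)                    # a 0-dimensional (scalar) tensor
print(torch.allclose(loss, manual))  # the two computations should agree
```

If this sketch is right, the mean over the batch is what produces the single scalar value.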