Hello all,
I am trying to understand how nn.CrossEntropyLoss() behaves when applied to a batch.
Assume I am performing binary classification and the batch size is B, so the output of my CNN has dimensions B×2.
I calculate the loss as loss = criterion(y, st), where y is the model's output (of dimensions B×2) and st holds the correct labels (0 or 1).
From my understanding, for each entry in the batch it computes the softmax and then calculates the loss.
It then sums all of these per-sample loss values and divides the result by the batch size.
So 'loss' in this case is a single scalar tensor.
Did I get this right?
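To make the question concrete, here is a small sketch of what I believe is happening, comparing nn.CrossEntropyLoss() (with its default reduction='mean') against a manual per-row log-softmax computation. The variable names (y, st, B) and the random data are just placeholders for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
B = 4
y = torch.randn(B, 2)           # stand-in for the model output, shape B x 2
st = torch.randint(0, 2, (B,))  # stand-in for the labels, shape B, values 0 or 1

criterion = nn.CrossEntropyLoss()  # default reduction='mean'
loss = criterion(y, st)

# manual version: log-softmax each row, take the entry at the true label,
# negate, then average over the batch
manual = -F.log_softmax(y, dim=1)[torch.arange(B), st].mean()

print(loss.shape)                    # a 0-dimensional (scalar) tensor
print(torch.allclose(loss, manual))  # the two computations should agree
```

If this sketch is right, the mean over the batch is what produces the single scalar value.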