Loss calculation in multi-class classifier Neural Network

Anshu_Garg · October 17, 2020, 10:59pm

Hi All,

I am new to PyTorch and ML development, and I have a basic question.
I am implementing a fully connected neural network, using a PyTorch dataset and torch.utils.data.DataLoader on the MNIST dataset for handwritten digit classification.
Suppose the batch_size passed to torch.utils.data.DataLoader is 5. After each batch, I get a model output of shape (5,10) (let’s call it predicted_output), where 5 is the batch size and 10 is the number of classes.
We calculate the loss for each batch.
The DataLoader provides the list of labels, which has shape (5,1) (let’s call it original_labels).

So my question is: when we use a loss function (let’s say nn.MSELoss), we pass it predicted_output and original_labels.
But ‘original_labels’ contains the actual labels (for example, the column vector [5, 8, 9, 3, 0]),
and ‘predicted_output’ contains a probability for each class. So, for one input sample, we have a row of shape (1,10), with one probability per class.

So, ideally, we should convert the ‘original_labels’ list to shape (5,10) before passing it to the loss function. In each row, the probability should be 1 for the actual label and 0 for the other labels (for example, if the label/class is digit 7, the row should be [0,0,0,0,0,0,0,1,0,0]).

Or is this conversion done internally by the nn.MSELoss function?
Should I convert the original_labels array to (5,10) format before sending it to the loss function (for example, change [5 8 9 3 0], with shape (5,1), to [[0,0,0,0,0,1,0,0,0,0], [0,0,0,0,0,0,0,0,1,0], [0,0,0,0,0,0,0,0,0,1], [0,0,0,1,0,0,0,0,0,0], [1,0,0,0,0,0,0,0,0,0]])?
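
For reference, the one-hot conversion described above can be sketched with torch.nn.functional.one_hot. This is a minimal illustration, assuming the labels are already a 1-D integer tensor; the variable names one_hot_labels and target_for_mse are made up for this example. As far as I know, nn.MSELoss does not perform this conversion itself, so it would have to be done by hand if that loss were used.

```python
import torch
import torch.nn.functional as F

# Integer class labels for a batch of 5, as in the example above.
original_labels = torch.tensor([5, 8, 9, 3, 0])

# one_hot expects an int64 (Long) tensor; num_classes=10 for the ten digits.
one_hot_labels = F.one_hot(original_labels, num_classes=10)

print(one_hot_labels.shape)  # torch.Size([5, 10])
print(one_hot_labels[0])     # tensor([0, 0, 0, 0, 0, 1, 0, 0, 0, 0])

# one_hot returns integers; a float copy would be needed for a loss such as MSELoss.
target_for_mse = one_hot_labels.float()
```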

KFrank · October 17, 2020, 11:27pm

Hi Anshu!

The short answer is: use CrossEntropyLoss.

A model output of shape (5, 10) is fine. You want the input to
CrossEntropyLoss (the output of your model) to have shape
[nBatch = 5, nClass = 10].

I don’t know what DataLoader provides, but you want the target
for CrossEntropyLoss (your labels) to have shape [nBatch = 5].

If you get labels with shape [nBatch, 1], squeeze() it to get rid of
the singleton dimension.
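
A small sketch of that squeeze step, assuming the labels really do arrive with shape [nBatch, 1] (the tensor below is a hand-built stand-in, not actual DataLoader output). It uses squeeze(1) rather than a bare squeeze() so that a batch of size 1 would keep its batch dimension.

```python
import torch

# Hypothetical labels with a trailing singleton dimension, shape [5, 1].
labels = torch.tensor([[5], [8], [9], [3], [0]])

# Remove only dimension 1, leaving shape [5].
target = labels.squeeze(1)

print(labels.shape)  # torch.Size([5, 1])
print(target.shape)  # torch.Size([5])
```

If the labels already have shape [nBatch], squeeze(1) is not needed and would raise an error on a 1-D tensor, so it is worth printing the shape once to check.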

Integer labels such as [5, 8, 9, 3, 0] are what you want.
CrossEntropyLoss expects a target that is a LongTensor of integer
class labels that range from 0 to nClass - 1.
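
Here is a minimal sketch of that call, with random numbers standing in for the model output (an assumption for illustration; in practice predicted_output would come from the network). CrossEntropyLoss applies log-softmax internally, so it is meant to receive raw scores (logits) rather than probabilities. The manual computation at the end is only there to show that the integer label picks out one entry per row, which is why no one-hot conversion is needed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for the model output: raw scores, shape [nBatch = 5, nClass = 10].
predicted_output = torch.randn(5, 10)

# Integer class labels, shape [nBatch = 5], dtype int64 (LongTensor).
original_labels = torch.tensor([5, 8, 9, 3, 0])

loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(predicted_output, original_labels)
print(loss.shape)  # torch.Size([]), a single scalar averaged over the batch

# Equivalent by hand: take the log-softmax of each row, select the entry
# at the true class index, negate, and average over the batch.
log_probs = F.log_softmax(predicted_output, dim=1)
manual = -log_probs[torch.arange(5), original_labels].mean()
print(torch.allclose(loss, manual))  # True
```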

Just to be clear, MSELoss is not a very good loss function for
classification. Use CrossEntropyLoss.

Best.

K. Frank

Anshu_Garg · October 17, 2020, 11:32pm

Thank you for your reply. I am talking about torch.utils.data.DataLoader.