CrossEntropyLoss getting value > 1


I am training on a highly imbalanced sentiment classification dataset using the transformers library's ELECTRA model (similar to BERT), with a classification head appended on top of it. The dataset has 5 classes, and I am logging the loss every 10 steps.

The stats for the imbalanced dataset are as follows:
The number of 1 labels: 135
The number of 2 labels: 43
The number of 3 labels: 74
The number of 4 labels: 303
The number of 5 labels: 2242

The batch_size I am using is 16.

The problem I am facing is that the training loss starts at about 1.5 and decreases to about 0.7.

Following are the doubts I have:

  1. Is it theoretically possible for the loss output from nn.CrossEntropyLoss to be > 1?
  2. I thought the loss output was expressed as a percentage, so how am I supposed to interpret a loss that starts at a value greater than 1?

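Yes, it is. Cross-entropy just gives you -sum_i(y_i * log(p_i)), where p_i is the model's predicted probability (its confidence) for the i'th class and y_i is 1 for the true class and 0 otherwise, so for a single sample it reduces to -log(p) of the true class. In theory, since the natural log ranges from -inf to 0 for inputs between 0 and 1, and we take the negative of that, the cross-entropy function ranges from 0 to +inf. It is not a percentage.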
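To make this concrete, here is a minimal sketch in plain Python of the per-sample value, assuming the default behaviour of nn.CrossEntropyLoss (softmax over the logits, then the negative natural log of the probability given to the true class). The function name and the example logits are made up for illustration. A model that is equally unsure about all 5 classes gets ln(5), roughly 1.61, which is close to the starting loss of 1.5 in the question.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Hypothetical logits for a 5-class problem; index 4 is the true class.
uniform = cross_entropy([0.0] * 5, target=4)                            # no preference
confident_right = cross_entropy([0.0, 0.0, 0.0, 0.0, 5.0], target=4)    # favours the true class
confident_wrong = cross_entropy([5.0, 0.0, 0.0, 0.0, 0.0], target=4)    # favours a wrong class

print(uniform, confident_right, confident_wrong)
```

The loss is close to 0 only when the model is confidently right, and it grows without bound as the model becomes confidently wrong, so values above 1 are expected early in training.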


Adding to the previous answer: use a metric such as accuracy to measure model improvement. Losses aren’t that easy to interpret, unless it’s MSE, imho.

Logs in general range from -inf to +inf:

  • ln(e^-100000) = -100000
  • ln(e^1000000) = 1000000

Yet, as @archit-spec indicates, if \hat{y} is in the range 0-1, then the log will be within -inf to 0, and -log within 0 to +inf.

First of all, thanks for the replies.
Then what do you all suggest I should log in the logger to plot: the cross-entropy loss coming from the network, or should I calculate precision and recall by hand and plot those? @archit-spec @Mah_Neh

Specifically, I want to report that the model is learning. What loss/metric should I report?

I tend to log train loss, test loss, and train and test accuracies as well. There is some overlap, but they also provide different information.

However, I do not know what the standard metric for NLP is.

Oh no, you can answer just from a categorical/classification point of view. I am doing simple sentiment analysis, which is just classification.
So, I should calculate the accuracy of the predictions using the target labels and the predicted labels (output from the model), is this what you are saying? Please clarify. Thanks again.

Yes, exactly, either manually or using a torcheval metric (torcheval is a package of utilities for evaluating your model).
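As a rough sketch of the manual route in plain Python, accuracy is just the fraction of predictions that match the targets. The helper name below is made up for illustration, and the class counts are the ones from the question. One caveat worth seeing with these counts: a hypothetical model that always predicts label 5 already scores about 80% accuracy, so accuracy alone can look good on this dataset without the model having learned the minority classes.

```python
def accuracy(targets, predictions):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(targets, predictions))
    return correct / len(targets)

# Class counts from the question (label -> number of samples).
counts = {1: 135, 2: 43, 3: 74, 4: 303, 5: 2242}
targets = [label for label, n in counts.items() for _ in range(n)]

# A hypothetical "model" that ignores its input and always predicts the majority class.
always_five = [5] * len(targets)
baseline = accuracy(targets, always_five)

print(f"majority-class baseline accuracy: {baseline:.3f}")
```

Any accuracy you report is easier to judge when compared against this baseline.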

You can also calculate the multiclass or binary confusion matrix; at least I find it useful for understanding what is going on.
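For illustration, a confusion matrix can be built by hand in a few lines of plain Python. This is only a sketch: the function names and the ten toy labels are invented, not taken from the dataset above. Rows are true labels and columns are predicted labels, so the per-class recall is the diagonal entry divided by its row sum.

```python
from collections import Counter

def confusion_matrix(targets, predictions, labels):
    """Rows are true labels, columns are predicted labels."""
    pair_counts = Counter(zip(targets, predictions))
    return [[pair_counts[(t, p)] for p in labels] for t in labels]

def per_class_recall(matrix):
    """Recall per class: diagonal entry divided by its row sum (0.0 for empty rows)."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]

# Toy data, invented for illustration only.
labels = [1, 2, 3, 4, 5]
targets = [1, 1, 2, 3, 4, 4, 5, 5, 5, 5]
predictions = [1, 5, 5, 3, 4, 5, 5, 5, 5, 4]

matrix = confusion_matrix(targets, predictions, labels)
recall = per_class_recall(matrix)

for label, row in zip(labels, matrix):
    print(label, row)
print(recall)
```

On an imbalanced dataset, a column for the majority class that collects most of the minority-class rows is a quick sign that the model is defaulting to that class.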

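I do not think TensorBoard has a dedicated confusion matrix view, though you can render the matrix as a matplotlib figure and log it as an image with SummaryWriter.add_figure.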

Thank you for the reply, I will try this.