Can we use cross entropy loss for binary classification

Hi All,
I want to confirm whether cross entropy loss can be used for binary, single-label classification. I am using a RoBERTa model and the dataset is highly imbalanced (many more 1s than 0s). I tried a weighted random sampler, but the F1 score is very low: 0.29 for class 0 and 0.75 for class 1, so I feel this is not working.
I also tried nn.BCEWithLogitsLoss with pos_weight, but the loss shows up as NaN.

So I thought of using cross entropy loss with class weights computed via sklearn's compute_class_weight. I hope I am doing this right.

I would appreciate it if you could confirm these two things:
1. Can I use cross entropy loss for binary classification in the above case?
2. If so, can I use sklearn's compute_class_weight to calculate the class weights?

Any assistance in this regard would be appreciated.

  1. Yes, you can use nn.CrossEntropyLoss for a binary classification use case and would treat it as a 2-class multi-class classification use case. In this case your model should output 2 logits instead of 1, as would be the case for a binary classification using nn.BCEWithLogitsLoss (see the sketch after this list).
  2. I don't know which scikit-learn method you want to use, but I guess these class weights might be a starting point to check whether your training improves.
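As a minimal sketch of point 1, the two output conventions would look roughly like this (the tensors and shapes below are just placeholders standing in for your model output and targets):

import torch
import torch.nn as nn

batch_size = 8

# setup 1: nn.BCEWithLogitsLoss - one logit per sample, float targets
logits_1 = torch.randn(batch_size)                       # shape [batch_size] (or [batch_size, 1])
targets_f = torch.randint(0, 2, (batch_size,)).float()   # 0./1. targets
loss_bce = nn.BCEWithLogitsLoss()(logits_1, targets_f)

# setup 2: nn.CrossEntropyLoss - two logits per sample, long targets (class indices)
logits_2 = torch.randn(batch_size, 2)                    # shape [batch_size, 2]
targets_l = targets_f.long()
loss_ce = nn.CrossEntropyLoss()(logits_2, targets_l)

print(loss_bce, loss_ce)

Both formulations are valid for a binary target; nn.CrossEntropyLoss simply treats it as a 2-class problem.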

Thanks for your response.
I have used nn.BCEWithLogitsLoss with pos_weight calculated as below:
y_train = torch.Tensor(y_train)
num_positives = torch.sum(y_train, dim=0)
num_negatives = len(train_dataset) - num_positives
pos_weight = num_negatives / num_positives
pos_weight
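For reference, here is a minimal, self-contained sketch of how I understand this pos_weight is meant to be passed to the loss (the targets below are placeholders, not my real data):

import torch
import torch.nn as nn

# placeholder float targets standing in for y_train
y_train = torch.tensor([1., 1., 1., 1., 0., 1., 1., 1., 0., 1.])

num_positives = torch.sum(y_train, dim=0)
num_negatives = len(y_train) - num_positives
# note: if num_positives were 0, this division would give inf, which can lead to NaN losses
pos_weight = num_negatives / num_positives

criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

logits = torch.randn(len(y_train))   # one logit per sample in the binary setup
loss = criterion(logits, y_train)    # targets must be floats in [0, 1]
print(loss)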

In my data the number of 0s is much smaller than the number of 1s. I used this method, and I also tried a weighted random sampler, for which I computed the sample weights as below:
y_train_indices = train_dataset.indices
y_train = [target[i] for i in y_train_indices]

# count samples per class
class_sample_count = np.array(
    [len(np.where(y_train == t)[0]) for t in np.unique(y_train)])
print(class_sample_count)

# weight each sample by the inverse frequency of its class
weight = 1. / class_sample_count
samples_weight = np.array([weight[t] for t in y_train])
samples_weight = torch.from_numpy(samples_weight)
print(samples_weight)

sampler = torch.utils.data.sampler.WeightedRandomSampler(samples_weight.type('torch.DoubleTensor'), len(samples_weight))
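The sampler is then passed to the training DataLoader. Here is a small self-contained sketch of that step (the batch size, feature tensor, and dataset are placeholders, not my real setup):

import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# placeholder imbalanced targets and features
target = np.array([1, 1, 1, 1, 1, 1, 1, 0, 1, 0])
features = torch.randn(len(target), 8)
train_dataset = TensorDataset(features, torch.from_numpy(target))

class_sample_count = np.array([(target == t).sum() for t in np.unique(target)])
weight = 1. / class_sample_count
samples_weight = torch.from_numpy(np.array([weight[t] for t in target])).double()
sampler = WeightedRandomSampler(samples_weight, len(samples_weight))

# the sampler replaces shuffle=True (the two cannot be used together)
train_loader = DataLoader(train_dataset, batch_size=4, sampler=sampler)

for data, labels in train_loader:
    print(labels)   # batches should now show a more balanced mix of 0s and 1s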

There is no improvement in the F1 score for the 0 class; the results are shown below:
              precision    recall  f1-score   support

           0       0.23      0.75      0.35         4
           1       0.93      0.58      0.72        24

    accuracy                           0.61        28
   macro avg       0.58      0.67      0.54        28
weighted avg       0.83      0.61      0.67        28

I ran for 3 epochs; both methods were tried by substituting them into the training DataLoader loop.
The validation loss is not decreasing for either method, as you can see above.
Hence I now want to try cross entropy loss, for which I am using the following:
np.unique(train_dataset.indices)
y_train = [target[i] for i in y_train_indices]
np.unique(y_train)
class_wts = compute_class_weight("balanced", classes=np.unique(y_train), y=y_train)
class_wts
and substituting these weights into the cross entropy loss.
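For reference, here is a minimal sketch of how I intend to plug these weights into the loss (assuming the model now returns two logits per sample; the targets below are placeholders):

import numpy as np
import torch
import torch.nn as nn
from sklearn.utils.class_weight import compute_class_weight

# placeholder targets standing in for my y_train
y_train = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]

# "balanced" -> n_samples / (n_classes * count_per_class)
class_wts = compute_class_weight("balanced", classes=np.unique(y_train), y=y_train)
weight = torch.tensor(class_wts, dtype=torch.float32)

criterion = nn.CrossEntropyLoss(weight=weight)

logits = torch.randn(len(y_train), 2)              # model output: [batch_size, 2]
target = torch.tensor(y_train, dtype=torch.long)   # class indices 0/1
loss = criterion(logits, target)
print(loss)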

I have read many posts on this forum about the first two methods, but in my case the model is overfitting. I am also not sure about this because I have a binary, single-label task, and the pos_weight I used is a single value, a tensor of 6.5, resulting from the computation above.
Kindly correct me if any step needs to be corrected in any of the three methods.
Thanks.