I have an NLP text-classification problem with a very skewed label distribution: class 0 is 98% and class 1 is 2%. For the training and validation data I oversample the minority class, so the distribution there becomes roughly class 0: 55%, class 1: 45% (see the sketch below).
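For context, a minimal sketch of the kind of oversampling I mean; the function name and use of sklearn's `resample` are illustrative, not my exact pipeline:

```python
import numpy as np
from sklearn.utils import resample

def oversample(texts, labels, target_pos_frac=0.45, seed=0):
    """Upsample class 1 with replacement until it makes up ~target_pos_frac of the data."""
    texts, labels = np.asarray(texts), np.asarray(labels)
    pos, neg = texts[labels == 1], texts[labels == 0]
    # choose n_pos so that n_pos / (n_pos + n_neg) ≈ target_pos_frac
    n_pos = int(len(neg) * target_pos_frac / (1 - target_pos_frac))
    pos_up = resample(pos, replace=True, n_samples=n_pos, random_state=seed)
    X = np.concatenate([neg, pos_up])
    y = np.concatenate([np.zeros(len(neg), dtype=int), np.ones(len(pos_up), dtype=int)])
    return X, y
```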
The test data keeps the original skewed distribution.
I built a model using `nn.BCEWithLogitsLoss(pos_weight=tensor(1.2579, device='cuda:0'))`; `pos_weight` was calculated from the 55/45 class distribution in the training data.
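For reference, a minimal sketch of that recipe, assuming `pos_weight` = (# negatives) / (# positives) on the oversampled split (55/45 gives ≈ 1.22; the exact 1.2579 would correspond to the actual sample counts rather than the rounded percentages):

```python
import torch
import torch.nn as nn

# Recipe: pos_weight = n_negative / n_positive on the (oversampled) training split.
n_neg, n_pos = 55, 45                       # class proportions after oversampling
pos_weight = torch.tensor([n_neg / n_pos])  # ≈ 1.22; move to GPU with .to('cuda:0')

criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
# usage: loss = criterion(logits, labels) with raw logits and float targets in {0., 1.}
```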
On class 1 of the test data I got an F1 score of 0.07, with (true negatives, false positives, false negatives, true positives) = (28809, 13258, 537, 495).
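As a sanity check, that F1 can be recomputed directly from the confusion matrix above:

```python
# Class-1 precision/recall/F1 from the reported confusion matrix.
tn, fp, fn, tp = 28809, 13258, 537, 495
precision = tp / (tp + fp)                          # ≈ 0.036
recall = tp / (tp + fn)                             # ≈ 0.480
f1 = 2 * precision * recall / (precision + recall)  # ≈ 0.067, i.e. the 0.07 above
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```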
I then switched to focal loss using the call below, but performance did not improve much: F1 on class 1 of the test data is still about the same, with (true negatives, false positives, false negatives, true positives) = (32527, 9540, 640, 392).

    kornia.losses.binary_focal_loss_with_logits(probssss, labelsss, alpha=0.25, gamma=2.0, reduction='mean')
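For completeness, a self-contained version of that call (the tensor names and shapes here are illustrative; note that `binary_focal_loss_with_logits` expects raw logits, not probabilities):

```python
import torch
import kornia.losses

# Illustrative tensors; in the real pipeline these come from the model and the batch.
logits = torch.randn(8, 1)                    # raw model outputs, no sigmoid applied
labels = torch.randint(0, 2, (8, 1)).float()  # binary targets in {0., 1.}

loss = kornia.losses.binary_focal_loss_with_logits(
    logits, labels, alpha=0.25, gamma=2.0, reduction='mean'
)
```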
- Are my `alpha` and `gamma` parameters wrong? Are there specific values I should try? I could tune them, but that might take a lot of time and resources, so I am looking for recommendations.
- For `nn.BCEWithLogitsLoss(pos_weight=tensor(1.2579, device='cuda:0'))`, should I use a different value for `pos_weight`?

Please keep in mind that my goal is to maximize F1 on class 1 of the test data.