I have a binary classification problem with highly imbalanced data (250 negatives for every 1 positive). If I use NLLLoss (or CrossEntropyLoss), what should the class weights be?
A common choice is weights inversely proportional to class frequency, e.g. `weight=torch.tensor([1/251, 250/251])`. This assumes your first label is 'negative' and your second label is 'positive'. If those normalized values leave the gradients very small, I am wondering if the unnormalized `[1, 250]` would be a better choice — note that with the default `reduction='mean'`, `CrossEntropyLoss` divides by the sum of the per-sample weights, so only the ratio between the two weights matters there; with `reduction='sum'` the overall scale does change the gradient magnitude.
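A minimal sketch of the idea, assuming 250 negatives per positive and label 0 = negative, label 1 = positive (the counts and variable names here are illustrative, not from the original post):

```python
# Inverse-frequency class weights for a 250:1 negative:positive imbalance.
counts = [250, 1]            # label 0 = negative, label 1 = positive
total = sum(counts)

# Unnormalized: weight each class by total / count, ratio is ~1:250.
raw = [total / c for c in counts]            # [1.004, 251.0]

# Normalized to sum to 1; same ratio, smaller absolute values.
normalized = [w / sum(raw) for w in raw]

print(raw, normalized)

# In PyTorch these would be passed as, e.g.:
#   loss_fn = torch.nn.CrossEntropyLoss(weight=torch.tensor([1.0, 250.0]))
# With the default reduction='mean', rescaling all weights by a constant
# cancels out, so raw vs. normalized gives the same mean loss.
```

Either vector encodes the same 1:250 preference; the normalized form only matters when the reduction does not divide by the weight sum.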