Is balancing mini-batches across the whole epoch required?

Hello all;

I have an imbalanced dataset with a 3-class output.
I managed to create well-stratified train and test sets.

My question: using WeightedRandomSampler gives me worse results than not using it (74% vs. 91% accuracy).
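
For context, this is roughly how I set up the sampler (a simplified sketch with placeholder data, not my exact code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Placeholder data: `targets` holds the class index (0, 1, or 2) of each training sample
features = torch.randn(1000, 10)
targets = torch.randint(0, 3, (1000,))
dataset = TensorDataset(features, targets)

# Weight each sample by the inverse frequency of its class
class_counts = torch.bincount(targets, minlength=3).float()
sample_weights = 1.0 / class_counts[targets]

sampler = WeightedRandomSampler(
    weights=sample_weights,
    num_samples=len(sample_weights),
    replacement=True,
)

# Note: sampler and shuffle are mutually exclusive in the DataLoader
loader = DataLoader(dataset, batch_size=64, sampler=sampler)
```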

Is there any logical explanation for this?

Last question: assuming my dataset were well balanced, am I required to have balanced mini-batches during the epoch? Can I have, for example, the first mini-batches containing only 2 classes, and the last ones containing the remaining class? Is this okay for training? Note that this question doesn't pertain to my use case.

Thank you very much,
Habib

You might be running into the Accuracy Paradox: accuracy can be a misleading metric for imbalanced classification use cases.
E.g., if the majority class occurs in 91% of all samples, your model might only predict this single class and still achieve 91% accuracy. While such a model would not be very useful, the accuracy “looks good”.
Balancing the dataset might thus reduce the accuracy, but increase other metrics such as the F1 score.
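
To illustrate, here is a small toy example using sklearn.metrics (the 91% split just mirrors your numbers and is made-up data):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Toy labels: class 0 makes up 91% of the samples, classes 1 and 2 the rest
y_true = np.array([0] * 91 + [1] * 5 + [2] * 4)

# A "model" that always predicts the majority class
y_pred = np.zeros_like(y_true)

print(accuracy_score(y_true, y_pred))                              # 0.91 -> looks good
print(f1_score(y_true, y_pred, average="macro", zero_division=0))  # ~0.32 -> reveals the problem
```

The accuracy looks high, while the macro F1 score shows that two of the three classes are never predicted correctly.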

Makes sense.

Thank you @ptrblck for strongly supporting this forum.