WeightedRandomSampler - Imbalanced classes, multiclass classification

When we use WeightedRandomSampler for a multiclass problem, do we set the reduction of the loss to 'mean' (always?), and what does this 'mean' mean?

The usage of a WeightedRandomSampler is independent of the loss function you use.
For nn.CrossEntropyLoss, for example, reduction='mean' means that

The losses are averaged across observations for each minibatch.

as explained in the docs.
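For illustration, a minimal sketch of that averaging for nn.CrossEntropyLoss (the logits and targets are toy values): reduction='none' keeps one loss value per sample, and reduction='mean' is their average over the batch.

```python
import torch
import torch.nn as nn

# Toy batch: 4 samples, 3 classes (values are illustrative only).
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 2])

# reduction='none' returns one loss value per sample in the batch.
per_sample = nn.CrossEntropyLoss(reduction='none')(logits, targets)

# reduction='mean' averages these per-sample losses over the minibatch.
mean_loss = nn.CrossEntropyLoss(reduction='mean')(logits, targets)

print(torch.allclose(per_sample.mean(), mean_loss))  # True
```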

Thanks,

Could you briefly explain to me the purpose of WeightedRandomSampler? I found an answer you gave here a few years ago and used it; everything was OK, but I don't know exactly what I did for my imbalanced data.

Also, is it OK that I used WeightedRandomSampler together with reduction='mean'?

WeightedRandomSampler is used to provide a weight for each sample, which is used during sampling to select the data samples for each batch.
The higher the weight assigned to a particular index, the more likely this data sample will be used in a batch.
To create batches of balanced classes, you would therefore assign a higher weight to samples from the minority class and a lower weight to samples from the majority class.
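A minimal sketch of this recipe, assuming a toy dataset with a 90/10 class imbalance (the counts, tensor shapes, and inverse-frequency weighting below are illustrative assumptions):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Toy imbalanced dataset: 90 samples of class 0, 10 samples of class 1.
targets = torch.cat([torch.zeros(90), torch.ones(10)]).long()
data = torch.randn(100, 8)
dataset = TensorDataset(data, targets)

# One weight per sample: the inverse frequency of its class, so the
# minority class gets the higher weight.
class_counts = torch.bincount(targets)   # tensor([90, 10])
class_weights = 1.0 / class_counts.float()
sample_weights = class_weights[targets]  # shape: [100]

# replacement=True lets minority samples be drawn multiple times per epoch.
sampler = WeightedRandomSampler(
    weights=sample_weights, num_samples=len(sample_weights), replacement=True)

loader = DataLoader(dataset, batch_size=20, sampler=sampler)
for _, y in loader:
    print(y.bincount(minlength=2))  # each batch should be roughly balanced
```

Note that passing a sampler is mutually exclusive with shuffle=True in the DataLoader; the sampler already randomizes the order.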

Yes, the reduction is independent of the sampling strategy.


So, according to you, if we use WeightedRandomSampler then we do not need to use weights in the cross-entropy loss function, right? If so, is there any benefit to using WeightedRandomSampler over weights in the cross-entropy loss?

You could use either, as both approaches have the same goal: to counter issues while training on an imbalanced dataset (i.e., overfitting to the majority class(es)). Users have different experiences with both approaches, and I'm sure some prefer one over the other. I haven't seen use cases where both methods were used simultaneously.
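For reference, the loss-weighting alternative looks roughly like this (the counts and the inverse-frequency scheme are illustrative assumptions, not the only valid choice):

```python
import torch
import torch.nn as nn

# Hypothetical class frequencies: class 0 appears 9x as often as class 1.
class_counts = torch.tensor([90.0, 10.0])

# Rescale the loss per class instead of resampling: errors on the
# minority class then contribute more to the (weighted) mean loss.
weight = class_counts.sum() / class_counts  # tensor([1.1111, 10.0])
criterion = nn.CrossEntropyLoss(weight=weight)

logits = torch.randn(4, 2)
targets = torch.tensor([0, 1, 0, 1])
loss = criterion(logits, targets)
```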