When we use WeightedRandomSampler for a multiclass problem, do we set the reduction of the loss to 'mean' (always?), and what does this 'mean' mean?
The usage of a WeightedRandomSampler is independent of the loss function used.
For e.g. nn.CrossEntropyLoss, reduction='mean' means that "The losses are averaged across observations for each minibatch.", as explained in the docs.
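To make that concrete, here is a small sketch (shapes and targets are made up for illustration) showing that reduction='mean' is just the average of the per-sample losses you would get with reduction='none':

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

logits = torch.randn(4, 3)            # batch of 4 samples, 3 classes
targets = torch.tensor([0, 2, 1, 2])  # one target class per sample

# 'mean' averages the per-sample losses across the minibatch
loss_mean = nn.CrossEntropyLoss(reduction='mean')(logits, targets)

# 'none' returns the individual per-sample losses
loss_none = nn.CrossEntropyLoss(reduction='none')(logits, targets)

print(torch.allclose(loss_mean, loss_none.mean()))  # True
```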
Thanks,
Could you briefly explain the purpose of WeightedRandomSampler? I found an answer you gave here a few years ago and used it; everything worked, but I don't know exactly what it did for my imbalanced data.
Also, is it OK that I used WeightedRandomSampler together with reduction='mean'?
WeightedRandomSampler
assigns a weight to each sample, which is used when drawing the data samples for each batch.
The higher the weight assigned to a particular index, the more likely this data sample will be drawn into a batch.
To create batches of balanced classes, you would therefore assign a higher weight to samples from the minority class, and a lower weight to samples from the majority class.
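A minimal sketch of that setup (the dataset, class counts, and batch size are hypothetical): each sample gets the inverse frequency of its class as its weight, so minority-class samples are drawn more often.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

torch.manual_seed(0)

# Hypothetical imbalanced dataset: 90 samples of class 0, 10 of class 1
data = torch.randn(100, 4)
targets = torch.cat([torch.zeros(90, dtype=torch.long),
                     torch.ones(10, dtype=torch.long)])

# One weight per *sample*: inverse class frequency
class_counts = torch.bincount(targets)       # tensor([90, 10])
class_weights = 1.0 / class_counts.float()   # minority class gets the larger weight
sample_weights = class_weights[targets]      # shape [100]

sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(sample_weights),
                                replacement=True)
loader = DataLoader(TensorDataset(data, targets),
                    batch_size=20, sampler=sampler)

# Batches should now be roughly balanced between the two classes
x, y = next(iter(loader))
print(y.bincount(minlength=2))
```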
Yes, the reduction is independent of the sampling strategy.
So according to you, if we use WeightedRandomSampler
, then we do not need to use weights in the cross-entropy loss function, right? If your answer is yes, then is there any benefit of using WeightedRandomSampler
over class weights in the cross-entropy loss?
You could use either, as both approaches have the same goal: to counter issues while training on an imbalanced dataset (i.e. overfitting to the majority class(es)). Users have different experiences with both approaches and I'm sure some might prefer one over the other. I haven't seen use cases where both methods were used simultaneously.
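For completeness, here is a sketch of the second approach (the class counts are hypothetical): instead of resampling, you pass per-class weights to the criterion so that errors on the minority class are penalized more heavily. Note that with a weight argument and reduction='mean', PyTorch computes a weighted average, normalizing by the sum of the weights of the targets in the batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical class counts: 90 samples of class 0, 10 of class 1
class_counts = torch.tensor([90.0, 10.0])

# A common heuristic: weight inversely proportional to class frequency
class_weights = class_counts.sum() / (len(class_counts) * class_counts)

criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(8, 2)
targets = torch.randint(0, 2, (8,))
loss = criterion(logits, targets)
print(loss.item())
```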