I am unable to find code showing how to perform random oversampling and random undersampling on an imbalanced dataset. Is there a pre-existing package or some code that is commonly used? I am relatively new to PyTorch. Any help would be appreciated!
You could use the WeightedRandomSampler to over- or undersample the dataset.
Here is an example showing how to use it.
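For reference, a minimal sketch of the sampler approach could look like the following. The toy dataset, the 900/100 class split, and the inverse-frequency weighting are assumptions chosen for illustration; they are not taken from the linked example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

torch.manual_seed(0)

# Toy imbalanced dataset (made-up sizes): 900 samples of class 0, 100 of class 1.
targets = torch.cat([
    torch.zeros(900, dtype=torch.long),
    torch.ones(100, dtype=torch.long),
])
data = torch.randn(len(targets), 5)
dataset = TensorDataset(data, targets)

# Weight each sample by the inverse frequency of its class,
# so that both classes are drawn with roughly equal probability.
class_counts = torch.bincount(targets)
class_weights = 1.0 / class_counts.float()
sample_weights = class_weights[targets]  # one weight per sample

# replacement=True lets minority samples be drawn more than once (oversampling).
sampler = WeightedRandomSampler(
    weights=sample_weights,
    num_samples=len(sample_weights),
    replacement=True,
)

# Pass the sampler instead of shuffle=True; the two options are mutually exclusive.
loader = DataLoader(dataset, batch_size=100, sampler=sampler)

for x, y in loader:
    # Each batch should now contain roughly as many class-1 as class-0 samples.
    print(torch.bincount(y, minlength=2))
```

With num_samples equal to the dataset size, the minority class is oversampled and the majority class is effectively undersampled within each epoch. Choosing a smaller num_samples would give a shorter, more undersampling-like epoch.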
Thanks for the reply! Does WeightedRandomSampler change the class weights, or does it actually oversample/undersample the data? Is there any difference between the two approaches?
WeightedRandomSampler is a custom sampler that draws samples from your Dataset according to the specified weights, so it can be used to over- or undersample the dataset.
Class weighting can be enabled via the weight argument of the criterion (e.g. nn.CrossEntropyLoss), which applies the class weights in the loss calculation.
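Both approaches can be used to counter the effects of an imbalanced dataset, but they work differently: the sampler changes which samples the model sees, while class weighting changes how much each sample contributes to the loss.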
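As a rough sketch of the loss-weighting approach, the snippet below passes per-class weights to nn.CrossEntropyLoss. The random logits stand in for model outputs, and the inverse-frequency weighting formula is just one common choice, assumed here for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy imbalanced targets (made-up sizes): 900 of class 0, 100 of class 1.
targets = torch.cat([
    torch.zeros(900, dtype=torch.long),
    torch.ones(100, dtype=torch.long),
])
# Random logits as a stand-in for the outputs of a real model.
logits = torch.randn(len(targets), 2)

# Inverse-frequency class weights: the rarer class gets the larger weight.
class_counts = torch.bincount(targets).float()
class_weights = class_counts.sum() / (len(class_counts) * class_counts)

# The data is sampled as usual; only the loss calculation is reweighted.
criterion = nn.CrossEntropyLoss(weight=class_weights)
loss = criterion(logits, targets)
print(loss)
```

Here every sample is still seen exactly once per epoch, but errors on the minority class are penalized more heavily.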