Using WeightedRandomSampler for Imbalanced Classes

Hi,
I am trying to use WeightedRandomSampler in this way

class_sample_count = [39736, 949, 7807]
weights = 1. / torch.Tensor(class_sample_count)
weights = weights.double()
sampler = torch.utils.data.sampler.WeightedRandomSampler(
    weights=weights,
    num_samples=?,
    replacement=False)
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], drop_last=True, sampler=sampler, batch_size=32) for x in ['train', 'valid']}

I want each minority class sample to be drawn at least once. What num_samples should I use? Also, am I using it the right way? I am currently seeing samples from only the minority class. Thank you in advance!


The weights tensor should contain a weight for each sample, not the class weights.
Have a look at this post for an example.
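A minimal sketch of what that post describes: expand the per-class weights into a weight for each sample by indexing with the target tensor. The targets here are synthetic placeholders matching the class counts from the question; in practice you would take them from your dataset.

```python
import torch
from torch.utils.data import WeightedRandomSampler

torch.manual_seed(0)

# Synthetic targets for illustration, matching the counts from the question;
# in practice use the label tensor of your actual dataset.
targets = torch.cat([
    torch.zeros(39736, dtype=torch.long),
    torch.ones(949, dtype=torch.long),
    torch.full((7807,), 2, dtype=torch.long),
])

class_sample_count = torch.bincount(targets)       # tensor([39736, 949, 7807])
class_weights = 1.0 / class_sample_count.double()  # one weight per class
sample_weights = class_weights[targets]            # one weight per *sample*

sampler = WeightedRandomSampler(
    weights=sample_weights,
    num_samples=len(sample_weights),  # draw as many samples as the dataset holds
    replacement=True,                 # oversampling the minority needs replacement
)
```

With this setup each epoch draws roughly the same number of samples from every class, repeating minority samples as needed.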


Okay I understand,
Thanks again @ptrblck… Your sample code in the link helped a lot! :slight_smile:


@ptrblck - I'm using WeightedRandomSampler for an imbalanced-class problem and have a doubt about the replacement parameter: does passing replacement=False ensure that there are no repeated samples within a batch? All my classes have more samples than the batch size.

replacement=False will not draw the same sample twice during the entire epoch, so while the first batches might have a balanced class distribution, the later ones will yield more samples from the majority classes, since the minority samples are used up early.
Thus, I don’t think using replacement=False is a proper way to balance the data batches.