Hi,
I am trying to use WeightedRandomSampler in this way
class_sample_count = [39736, 949, 7807]
weights = 1 / torch.Tensor(class_sample_count)
weights = weights.double()
sampler = torch.utils.data.sampler.WeightedRandomSampler(
    weights=weights,
    num_samples=?,
    replacement=False)
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], drop_last=True, sampler=sampler, batch_size=32) for x in ['train', 'valid']}
I want each minority-class sample to be drawn at least once. What num_samples should I use? Also, am I using it the right way? I am currently seeing samples from only the minority class. Thank you in advance.
The weights tensor should contain a weight for each sample, not the class weights.
Have a look at this post for an example.
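As a minimal sketch of the idea: build the per-class weights from the counts, then index them with each sample's target to get one weight per sample. The `targets` tensor below is hypothetical, constructed to match the class counts from the question; in practice you would take it from your dataset (e.g. `image_datasets['train'].targets` for an `ImageFolder`).

```python
import torch

# Class counts from the question above
class_sample_count = [39736, 949, 7807]
class_weights = 1.0 / torch.tensor(class_sample_count, dtype=torch.double)

# Hypothetical targets for illustration: one class index per sample.
# In a real setup, read these labels from your dataset instead.
targets = torch.cat([torch.full((n,), i, dtype=torch.long)
                     for i, n in enumerate(class_sample_count)])

# Per-sample weight = the weight of that sample's class
sample_weights = class_weights[targets]

sampler = torch.utils.data.WeightedRandomSampler(
    weights=sample_weights,
    num_samples=len(sample_weights),  # draw one epoch's worth of indices
    replacement=True)                 # replacement=True allows oversampling
```

With `replacement=True` and these weights, each batch drawn through this sampler should contain roughly equal numbers of samples from all three classes.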
Okay, I understand.
Thanks again @ptrblck, your sample code in the link helped a lot.
@ptrblck - I'm using WeightedRandomSampler for an imbalanced-class problem and have a doubt about the replacement parameter. Does passing replacement=False ensure that there are no repetitions of samples within a batch? All of my classes have more samples than the batch size.
replacement=False will not draw the same sample twice during the entire epoch, so while the first batches might have a balanced class distribution, the later ones will yield more samples of the majority classes (the minority samples get used up early).
Thus, I don't think replacement=False is a proper way to balance the data batches.
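This can be seen in a small sketch with hypothetical toy data (90 majority vs. 10 minority samples): with replacement=False and num_samples equal to the dataset size, every sample is drawn exactly once, so the sampler only reorders the data rather than oversampling the minority class.

```python
import torch

# Toy demo (hypothetical data): 90 majority-class and 10 minority-class samples
targets = torch.tensor([0] * 90 + [1] * 10)
class_weights = 1.0 / torch.bincount(targets).double()
sample_weights = class_weights[targets]

sampler = torch.utils.data.WeightedRandomSampler(
    weights=sample_weights,
    num_samples=len(sample_weights),
    replacement=False)

idx = torch.tensor(list(sampler))
# Without replacement, every index is drawn exactly once: the minority
# samples tend to appear early in the epoch, but are never repeated,
# so later batches are dominated by the majority class.
```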