WeightedRandomSampler with weights per class

Hi all,
I’m currently using WeightedRandomSampler by passing a weights tensor with one weight per sample, and it’s working fine. I was wondering if there’s a way to specify weights per class instead, since I’m working with millions of samples and the weights tensor is therefore just as large.

Thanks.

No, I don’t think that’s currently possible. Note, however, that 1 million float32 values take only ~4MB of memory, so I’m unsure how much memory you would save in the end.

Thanks for the prompt reply. Indeed, memory is not an issue: I have ~150M samples, so that is ~570MB, which is still feasible. The time-consuming part is scanning all the samples at the start of training to get each sample’s class and build the weights tensor, but I guess I need to find a smarter way to do that.

Thanks again.
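
If the labels can be obtained up front as an integer tensor (an assumption; it depends on how the dataset stores them), the per-sample weights can be built with a single indexing operation instead of a Python loop over the samples. A minimal sketch, where `targets` and the inverse-frequency weighting are illustrative choices rather than anything from this thread:

```python
import torch
from torch.utils.data import WeightedRandomSampler

# Hypothetical labels: one integer class id per sample, assumed to be
# available without loading the samples themselves (e.g. from metadata).
targets = torch.tensor([0, 0, 0, 0, 1, 1, 2])

# Per-class weights; inverse class frequency is used here as an example.
class_counts = torch.bincount(targets)
class_weights = 1.0 / class_counts.float()

# Expand per-class weights to per-sample weights in one vectorized step.
sample_weights = class_weights[targets]

sampler = WeightedRandomSampler(
    sample_weights, num_samples=len(sample_weights), replacement=True
)
```

The weights tensor still has one entry per sample, but creating it no longer requires iterating over the dataset.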

Side note: WeightedRandomSampler is based on torch.multinomial, and I just realized that torch.multinomial doesn’t support inputs with more than 2^24 (~16.7M) entries, so it can’t handle a weights tensor of this size.
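
One possible workaround, sketched here under assumptions and not something PyTorch provides, is a two-stage sampler: first draw a class with torch.multinomial (which then only sees one entry per class), then draw a sample uniformly within that class. `PerClassSampler` and its arguments are hypothetical names. Note that `class_weights` here is the total probability mass per class, so the result matches sampling with replacement where each sample’s weight is its class weight divided by the class size.

```python
import torch
from torch.utils.data import Sampler


class PerClassSampler(Sampler):
    """Hypothetical two-stage sampler: pick a class, then a sample in it."""

    def __init__(self, targets, class_weights, num_samples, generator=None):
        self.num_samples = num_samples
        self.generator = generator
        self.class_weights = torch.as_tensor(class_weights, dtype=torch.double)
        # Sort sample indices by class so each class is a contiguous slice.
        self.order = torch.argsort(targets)
        self.counts = torch.bincount(targets, minlength=len(self.class_weights))
        self.starts = torch.cumsum(self.counts, 0) - self.counts
        # A class with positive weight must contain at least one sample.
        if torch.any((self.counts == 0) & (self.class_weights > 0)):
            raise ValueError("class with positive weight has no samples")

    def __iter__(self):
        # Stage 1: draw a class per slot (only num_classes categories).
        classes = torch.multinomial(
            self.class_weights, self.num_samples,
            replacement=True, generator=self.generator,
        )
        # Stage 2: draw a uniform position inside the chosen class's slice.
        u = torch.rand(self.num_samples, dtype=torch.double,
                       generator=self.generator)
        chosen_counts = self.counts[classes]
        offsets = torch.minimum((u * chosen_counts).long(), chosen_counts - 1)
        yield from self.order[self.starts[classes] + offsets].tolist()

    def __len__(self):
        return self.num_samples
```

For example, `PerClassSampler(targets, [1.0, 1.0, 1.0], num_samples=len(targets))` would draw the three classes equally often regardless of their sizes. It still needs the labels once, to group indices by class, and it always samples with replacement.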