I need to create a dataloader that samples random datapoints from each class or, given a probability distribution over the classes, samples each class in that proportion. Is this possible?
Yes, you can implement a Sampler (see here for details).
May I have an example of creating your own sampler for something like this? I’m assuming I have to go through the dataset sequentially and separate it into different classes as well?
Have a look at Detectron2’s RepeatFactorTrainingSampler for an example of one approach. Even if you don’t use a Detectron2 model, the sampler is just standard PyTorch code and should work fine with other models. However, I found that it didn’t deal properly with empty images, so I modified it a bit for my own use.
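If you want something smaller to start from, here is a minimal sketch of a class-proportional sampler. It is not the Detectron2 approach, just one way to do it under some assumptions: you can get a flat list of integer labels for the dataset, and sampling with replacement is acceptable. The names `ClassProportionSampler`, `labels`, and `class_probs` are made up for this example. And yes, it does one sequential pass over the labels to group the indices by class.

```python
import random
from collections import defaultdict


class ClassProportionSampler:
    """Yield dataset indices so that each class appears with a chosen probability.

    Hypothetical helper for illustration. It is written in plain Python;
    it could also subclass torch.utils.data.Sampler.
    """

    def __init__(self, labels, class_probs, num_samples, seed=None):
        # One sequential pass over the labels to group indices by class.
        self.indices_by_class = defaultdict(list)
        for idx, label in enumerate(labels):
            self.indices_by_class[label].append(idx)

        # Every class we are asked to sample from must actually occur.
        missing = set(class_probs) - set(self.indices_by_class)
        if missing:
            raise ValueError(f"classes not present in labels: {sorted(missing)}")

        self.classes = list(class_probs)
        self.weights = [class_probs[c] for c in self.classes]
        self.num_samples = num_samples
        self.rng = random.Random(seed)

    def __iter__(self):
        for _ in range(self.num_samples):
            # First pick a class according to the requested distribution,
            # then pick a random datapoint of that class (with replacement).
            cls = self.rng.choices(self.classes, weights=self.weights, k=1)[0]
            yield self.rng.choice(self.indices_by_class[cls])

    def __len__(self):
        return self.num_samples


# Example: an imbalanced dataset (90% class 0) sampled at 50/50.
labels = [0] * 90 + [1] * 10
sampler = ClassProportionSampler(labels, {0: 0.5, 1: 0.5}, num_samples=1000, seed=0)
indices = list(sampler)
```

You would then pass it to the loader as `DataLoader(dataset, batch_size=..., sampler=sampler)` and leave `shuffle` at its default, since `sampler` and `shuffle=True` cannot be combined. If per-sample weights are enough for your case, PyTorch's built-in `torch.utils.data.WeightedRandomSampler` may save you from writing your own class.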