How to Prevent Overfitting


#21

I met a similar problem. Something seems to be wrong in WeightedRandomSampler.

One more question: WeightedRandomSampler seems similar to the weight parameter in nn.CrossEntropyLoss. Which one do you suggest using? @smth
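For reference, the loss-weighting variant I mean looks roughly like this (the class counts are made-up numbers for illustration):

```python
import torch
import torch.nn as nn

# Hypothetical counts for a 3-class problem (illustrative only).
class_counts = torch.tensor([100.0, 10.0, 1.0])

# Rather than resampling the data, weight each class in the loss
# inversely to its frequency so rare classes contribute more.
class_weights = 1.0 / class_counts
criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(8, 3)           # batch of 8 predictions
targets = torch.randint(0, 3, (8,))  # ground-truth class indices
loss = criterion(logits, targets)
```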

Thanks!


(Caihong1105) #22

Can you share some code showing how you do background variation for images?

Thanks so much!


#23

Unfortunately I can't, as it is pretty specific to my project. But a good way to approach it would be to use OpenCV or something similar, since it has a ton of image manipulation algorithms.
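That said, the general idea can be sketched with OpenCV. The file names and the mask-based compositing below are assumptions for illustration, not my actual pipeline:

```python
import cv2
import numpy as np

# Load a foreground image and a binary mask separating object from background
# (the paths and the availability of a mask are assumed for illustration).
image = cv2.imread("object.png")
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)

# Load a replacement background and resize it to match the image.
background = cv2.imread("new_background.png")
background = cv2.resize(background, (image.shape[1], image.shape[0]))

# Composite: keep object pixels where the mask is set, background elsewhere.
mask3 = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR) > 0
augmented = np.where(mask3, image, background)

cv2.imwrite("augmented.png", augmented)
```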


(Chahrazad Essalim) #24

I also encountered the same problem as @wangg12: using the above code results in running the train iteration on a single batch, @smth. The docs are also not clear on how to use WeightedRandomSampler with a DataLoader.


#25

@Chahrazad All samplers are used in a consistent way.

You first create a sampler object. For example, let's say you have 10 samples in your Dataset:

import torch

dataset_length = 10   # number of samples in your Dataset
epoch_length = 100    # each epoch sees 100 draws of samples

# Weights must be non-negative (they need not sum to 1),
# so use torch.rand rather than torch.randn here.
sample_weights = torch.rand(dataset_length)
weighted_sampler = torch.utils.data.sampler.WeightedRandomSampler(sample_weights, epoch_length)
torch.utils.data.DataLoader(...., sampler=weighted_sampler)
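
Then you iterate over the DataLoader with an ordinary for loop; the sampler only controls which indices get drawn. A runnable sketch with a toy TensorDataset (the dataset here is made up for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Toy dataset: 10 samples with 3 features each, binary labels.
data = torch.randn(10, 3)
labels = torch.randint(0, 2, (10,))
dataset = TensorDataset(data, labels)

weights = torch.rand(10)  # non-negative per-sample weights
sampler = WeightedRandomSampler(weights, num_samples=100)  # 100 draws per epoch

loader = DataLoader(dataset, batch_size=4, sampler=sampler)

# An ordinary for loop; one pass yields 100 weighted draws (25 batches of 4).
for batch_data, batch_labels in loader:
    pass  # training step goes here
```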

(Mamy Ratsimbazafy) #26

Here is an example repo for a Kaggle competition where I experimented with data augmentation and weighted sampling.

The data augmentation primitives are here. They inherit from a “RandomOrder” object that composes transformations, and they are called there by the dataloader.
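
For readers who want the gist without digging through the repo, the pattern is roughly this (a minimal sketch, not the repo's actual code):

```python
import random

class RandomOrder:
    """Compose a list of transforms, applying them in a random order each call."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        # random.sample returns a shuffled copy, leaving self.transforms intact.
        for transform in random.sample(self.transforms, len(self.transforms)):
            img = transform(img)
        return img
```

torchvision also ships a transforms.RandomOrder with similar behavior, if you prefer a stock implementation.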


(Mohammad Mehdi Derakhshani) #27

Hi @smth, I have a question about WeightedRandomSampler. When you create a DataLoader with a weighted sampler, how do you iterate over the DataLoader? I mean the for loop for iteration. It seems that we should draw samples from our DataLoader instead of iterating over it from start to finish as a simple DataLoader does (when the sampler argument is None)! Could you please elaborate more on this issue?