I would like to ask about the difference between the batch size in the DataLoader and num_samples in WeightedRandomSampler.
I used to set
num_samples=len(sample_weights) with a DataLoader batch size of 16. Then, after about 50 epochs, I changed it to num_samples=16 and the training accuracy went down, although my validation accuracy did not change much.
I am confused. I know num_samples means the number of samples we draw for each iteration, but is it the same as the DataLoader batch size?
Thank you in advance.
The num_samples argument in WeightedRandomSampler defines the number of samples drawn in each epoch, while the batch_size in the DataLoader defines the number of samples drawn in each iteration.
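A minimal sketch of this distinction, assuming a hypothetical dataset of 100 samples with uniform weights (the sizes and weights are just for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Hypothetical dataset of 100 samples
data = torch.randn(100, 3)
targets = torch.randint(0, 2, (100,))
dataset = TensorDataset(data, targets)

# One weight per sample (uniform here, just for illustration)
sample_weights = torch.ones(len(dataset))

# num_samples: how many samples the sampler yields per epoch
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(sample_weights),
                                replacement=True)

# batch_size: how many of those samples are grouped per iteration
loader = DataLoader(dataset, batch_size=16, sampler=sampler)

# 100 draws per epoch / batch size 16 -> 7 iterations (last batch has 4)
print(len(loader))
```

So with num_samples=len(sample_weights) and batch_size=16, one epoch consists of 7 iterations covering 100 drawn samples.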
Thanks for clarifying @ptrblck. I wonder which one is better: drawing samples from the entire length of the dataset, or drawing fewer?
The common approach would be to let the sampler use all samples unless you have a valid reason to reduce the number of samples drawn in each epoch.
Note that even then you would still be able to draw all samples, but you would need to increase the number of epochs to draw the same total number of samples.
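A quick sketch of that trade-off, again with hypothetical numbers (100 samples, uniform weights): each pass over the sampler corresponds to one epoch, so a smaller num_samples simply means fewer draws per epoch and proportionally more epochs to reach the same total.

```python
import math
import torch
from torch.utils.data import WeightedRandomSampler

# Hypothetical: 100 samples with uniform weights
weights = torch.ones(100)

small = WeightedRandomSampler(weights, num_samples=16, replacement=True)
full = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)

# One full iteration over a sampler corresponds to one epoch
print(len(list(small)))  # 16 draws per epoch
print(len(list(full)))   # 100 draws per epoch

# Epochs needed with num_samples=16 to match 100 total draws
print(math.ceil(len(weights) / 16))  # 7
```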