How to enable the dataloader to sample from each class with equal probability

weiyi_xie · November 6, 2018, 9:30am

In your StratifiedSampler, why do you set n_splits to the number of batches when you only iterate over the shuffle-and-split iterator once? To my knowledge, n_splits defines the K in K-fold cross-validation, and StratifiedShuffleSplit just ensures that each split follows the class distribution of the whole dataset. To me, it would make sense for your StratifiedSampler to use n_splits=1, since you always reconstruct the StratifiedShuffleSplit anyway?
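
To make the question concrete, here is a minimal sketch of how I understand n_splits to behave in scikit-learn's StratifiedShuffleSplit: it sets how many independent shuffle-and-split iterations the iterator yields, so taking only the first one with next() leaves the rest unused. The toy labels, the 50% test_size, and the helper name first_split are assumptions made up for illustration; they are not taken from the StratifiedSampler being discussed.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical toy labels: 80 samples of class 0 and 20 samples of class 1.
y = np.array([0] * 80 + [1] * 20)
X = np.zeros((len(y), 1))  # placeholder features; stratification only uses y


def first_split(n_splits, seed=0):
    """Build a splitter and consume only its first split (illustrative helper)."""
    sss = StratifiedShuffleSplit(n_splits=n_splits, test_size=0.5, random_state=seed)
    # next() pulls a single (train_idx, test_idx) pair; any further splits
    # the iterator could have produced are never generated.
    return next(sss.split(X, y))


train_idx, test_idx = first_split(n_splits=1)

# Per-class counts in the sampled half: the 80/20 ratio is preserved.
class_counts = np.bincount(y[test_idx])
print(class_counts)

# n_splits only controls how many splits the iterator yields in total.
n_yielded = len(list(StratifiedShuffleSplit(n_splits=3, test_size=0.5, random_state=0).split(X, y)))
print(n_yielded)
```

If that reading is right, a sampler that rebuilds the splitter on every call and takes a single split would get the same kind of stratified index set with n_splits=1.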