How to use WeightedRandomSampler with KFold?

Arohan_Ajit · July 31, 2021, 10:12am

I used the following code for weighted random sampling in my dataset:

    # One weight per class, indexed by the integer class label
    class_weights = [250, 859]
    sample_weights = [0] * len(data)
    # data.imgs yields (image path, class index) pairs
    for idx, (path, label) in enumerate(data.imgs):
        sample_weights[idx] = class_weights[label]

However, when using stratified k-fold, how can I use WeightedRandomSampler with the train and test indices provided by each split?

My implementation (which is probably incorrect, since it uses all the data for sampling rather than only the indices in the split):

splits = 2
kfold = StratifiedKFold(n_splits=splits,shuffle=True,random_state=2)
for fold,(train_idx,test_idx) in enumerate(kfold.split(data,data.targets)):
    print(f'FOLD {fold}')
    print('--------------------------------')

    print(train_idx)
    class_weights = [250,859]
    sample_weights = [0] * len(data)
    for idx, (inputs,label) in enumerate(data.imgs):
        class_weight = class_weights[label]
        sample_weights[idx] = class_weight

    train_sampler = WeightedRandomSampler(sample_weights, num_samples=len(sample_weights), replacement=True)

    test_sampler = SubsetRandomSampler(test_idx)
    loaders = {
        'train': torch.utils.data.DataLoader(data, batch_size=16, sampler=train_sampler),
        'test': torch.utils.data.DataLoader(data, batch_size=16, sampler=test_sampler),
    }

Asma_Bouzidi · June 14, 2022, 12:22pm

I have the same problem. Did you find a solution, please?
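
One possible approach, offered as a sketch rather than a confirmed answer: wrap each fold's indices in a `torch.utils.data.Subset` and build the sample weights only from the training fold's labels. `WeightedRandomSampler` draws positions in `range(len(weights))`, so if the weights have length `len(train_idx)` and the loader wraps `Subset(data, train_idx)`, the drawn positions map back to training samples only. The helper name `make_fold_loaders`, the toy `TensorDataset`, and the inverse-class-frequency weighting are assumptions made for illustration; they replace the fixed `[250, 859]` weights from the question. With an `ImageFolder`, `data.targets` would presumably be passed as `targets`.

```python
import numpy as np
import torch
from sklearn.model_selection import StratifiedKFold
from torch.utils.data import DataLoader, Subset, TensorDataset, WeightedRandomSampler


def make_fold_loaders(dataset, targets, n_splits=2, batch_size=16, seed=2):
    """Yield (fold, loaders, train_idx, test_idx) for each stratified fold."""
    targets = np.asarray(targets)
    kfold = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # split() only needs the number of samples from X, so a dummy array is enough.
    dummy_x = np.zeros(len(targets))
    for fold, (train_idx, test_idx) in enumerate(kfold.split(dummy_x, targets)):
        train_targets = targets[train_idx]
        # Assumption: weight each sample by the inverse of its class count,
        # counted within this training fold only.
        class_counts = np.bincount(train_targets)
        sample_weights = 1.0 / class_counts[train_targets]

        # The sampler yields positions 0..len(train_idx)-1. These index into
        # the training Subset below, so test samples are never drawn.
        train_sampler = WeightedRandomSampler(
            weights=sample_weights.tolist(),
            num_samples=len(train_idx),
            replacement=True,
        )
        loaders = {
            "train": DataLoader(
                Subset(dataset, train_idx.tolist()),
                batch_size=batch_size,
                sampler=train_sampler,
            ),
            "test": DataLoader(
                Subset(dataset, test_idx.tolist()),
                batch_size=batch_size,
                shuffle=False,
            ),
        }
        yield fold, loaders, train_idx, test_idx


# Toy imbalanced dataset standing in for the ImageFolder in the question.
# Each feature stores its own original index so draws can be traced back.
labels = np.array([0] * 25 + [1] * 86)
features = torch.arange(len(labels), dtype=torch.float32).unsqueeze(1)
toy_data = TensorDataset(features, torch.as_tensor(labels))

for fold, loaders, train_idx, test_idx in make_fold_loaders(toy_data, labels):
    drawn = torch.cat([x for x, _ in loaders["train"]]).flatten().long().tolist()
    leaked = set(drawn) & set(test_idx.tolist())
    print(f"FOLD {fold}: drew {len(drawn)} training samples, {len(leaked)} from the test split")
```

The test loader uses a plain `Subset` without a sampler, so every test sample is visited exactly once per pass. Recomputing the weights inside each fold keeps them consistent with that fold's class balance, although with stratified splits the ratios should be close to those of the full dataset anyway.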