Hi,
I am dealing with imbalanced data (only 2% of the samples belong to the minority class). I tried the WeightedRandomSampler approach, which works acceptably on my validation set but fails on an independent test set. I came across https://github.com/ufoym/imbalanced-dataset-sampler and wanted to try this approach on my data. The problem is that the ImbalancedDatasetSampler class can't figure out the labels in my TensorDataset object.
train_loader = data_utils.DataLoader(train_dataset, batch_size = BATCH_SIZE, sampler=ImbalancedDatasetSampler(train_dataset))
The line above raises the following error:
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-13-b7cc711025fa> in <module>
      2 train_loader = data_utils.DataLoader(train_dataset,
      3                                      batch_size = BATCH_SIZE,
----> 4                                      sampler=ImbalancedDatasetSampler(train_dataset))

~/py_torch_sampler.py in __init__(self, dataset, indices, num_samples, callback_get_label)
     30         label_to_count = {}
     31         for idx in self.indices:
---> 32             label = self._get_label(dataset, idx)
     33             if label in label_to_count:
     34                 label_to_count[label] += 1

~/py_torch_sampler.py in _get_label(self, dataset, idx)
     51             return self.callback_get_label(dataset, idx)
     52         else:
---> 53             raise NotImplementedError
     54
     55     def __iter__(self):

NotImplementedError:
Could someone tell me how I can solve this problem?
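
For reference, the traceback suggests that this version of the sampler accepts a callback_get_label argument and calls it as callback_get_label(dataset, idx) before falling through to NotImplementedError. Below is a minimal sketch of such a callback, assuming each item of the dataset is a (features, label) pair, as in a TensorDataset built from two tensors. The names get_label and toy_dataset are hypothetical, and the plain list only stands in for the real dataset so the snippet runs without torch.

```python
from collections import Counter


def get_label(dataset, idx):
    """Return the label of sample `idx` as a plain Python int.

    Assumption: dataset[idx] yields (features, label). int() also converts
    a 0-d tensor label, so the result is hashable by value and can be used
    as a dictionary key when counting samples per class.
    """
    return int(dataset[idx][1])


# Hypothetical stand-in for a TensorDataset: a list of (features, label) pairs.
toy_dataset = [([0.1, 0.2], 0), ([0.3, 0.4], 0), ([0.5, 0.6], 1)]

# Count samples per class the same way the sampler's __init__ loop does.
label_counts = Counter(get_label(toy_dataset, i) for i in range(len(toy_dataset)))
print(label_counts)
```

If the signature shown in the traceback holds, the callback would presumably be passed as ImbalancedDatasetSampler(train_dataset, callback_get_label=get_label). Other versions of the library may expect a different callback signature, so this is worth checking against the installed source.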