Is it a good idea to use both weighted CrossEntropyLoss and WeightedRandomSampler at the same time? Or should I only be using one at a time for imbalanced datasets? Should I use weighted CrossEntropyLoss for the training data and WeightedRandomSampler for the validation data?
# WeightedRandomSampler needs one weight per sample, so map each sample's label to its class weight
sample_weights = [train_weights[label] for _, label in t_data]
t_sampler = torch.utils.data.sampler.WeightedRandomSampler(sample_weights, len(t_data))
criterion = nn.CrossEntropyLoss(weight=train_weights)  # per-class weights
The weights are created with:
from collections import Counter
def count_classes(dataset):
    # sample_tup[1] is the class label of each (input, label) tuple
    class_counts = dict(Counter(sample_tup[1] for sample_tup in dataset))
    class_counts = dict(sorted(class_counts.items()))
    return class_counts
train_class_counts = count_classes(t_data)
train_weights = [1 / train_class_counts[class_id] for class_id in range(num_classes)]
train_weights = torch.FloatTensor(train_weights)
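For reference, here is a minimal, self-contained sketch of the sampler side of this setup on a toy imbalanced dataset (the dataset and variable names here are hypothetical stand-ins, not your actual `t_data`). The key point it illustrates is that the sampler takes one weight per sample, indexed by each sample's label:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Toy imbalanced dataset: 90 samples of class 0, 10 samples of class 1
labels = torch.cat([torch.zeros(90, dtype=torch.long), torch.ones(10, dtype=torch.long)])
features = torch.randn(100, 4)
dataset = TensorDataset(features, labels)

# Per-class inverse-frequency weights (same idea as the snippet above)
class_counts = torch.bincount(labels)
class_weights = 1.0 / class_counts.float()

# Expand to one weight PER SAMPLE by indexing with each sample's label
sample_weights = class_weights[labels]
sampler = WeightedRandomSampler(sample_weights, num_samples=len(dataset), replacement=True)

loader = DataLoader(dataset, batch_size=20, sampler=sampler)
# Labels drawn through the sampler should now be roughly class-balanced
drawn = torch.cat([y for _, y in loader])
print(drawn.float().mean())  # close to 0.5 instead of the raw 0.1
```

With this in place the batches are already rebalanced by resampling, so if you also pass per-class weights to `CrossEntropyLoss` you are correcting for the imbalance twice, which can over-weight the minority class. And note the sampler is only for training; the validation loader should see the natural class distribution, unweighted.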