I have a very skewed dataset in which the number of samples per class is [74, 859]. My training method is correct, so I suspect I am making an error in how I use the weighted random sampler and the class weights in cross entropy. Is my implementation correct? I used WeightedRandomSampler for the train dataloader and SubsetRandomSampler for the test dataloader.
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from sklearn.model_selection import KFold
from torch.utils.data import SubsetRandomSampler

# data, device, train and train_on_gpu are defined elsewhere (not shown)
splits = 10
kfold = KFold(n_splits=splits, shuffle=False)
for fold, (train_idx, test_idx) in enumerate(kfold.split(data)):
    print(f'FOLD {fold}')
    print('--------------------------------')
    # Inverse class frequency: one weight per class
    class_sample_count = [74, 859]
    weights = 1 / torch.Tensor(class_sample_count)
    weights = weights.double()
    # Train sampler built from the two class weights, drawing 10 samples per epoch
    train_sampler = torch.utils.data.sampler.WeightedRandomSampler(weights, 10)
    test_sampler = SubsetRandomSampler(test_idx)
    loaders = {
        'train': torch.utils.data.DataLoader(data, batch_size=16, sampler=train_sampler),
        'test': torch.utils.data.DataLoader(data, batch_size=16, sampler=test_sampler)
    }
    # Pretrained VGG16 with frozen convolutional features and a new 2-class head
    vgg16 = torchvision.models.vgg16(pretrained=True)
    for param in vgg16.features.parameters():
        param.requires_grad = False
    num_ftrs = vgg16.classifier[6].in_features
    vgg16.classifier[6] = nn.Linear(num_ftrs, 2)
    model = vgg16.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    train(3, loaders, model, optimizer, criterion, train_on_gpu, 'model.pt')
    # Reload the checkpoint saved at the best validation loss
    model.load_state_dict(torch.load('model.pt'))
Output:
Batch: 0 Epoch: 1 Training Loss: 0.666256
Epoch: 1 Training Loss: 0.666256 Validation Loss: 11.949794 Accuracy: 0.237942
Validation loss decreased (inf --> 11.949794). Saving model ...
Batch: 0 Epoch: 2 Training Loss: 0.000000
Epoch: 2 Training Loss: 0.000000 Validation Loss: 32.636715 Accuracy: 0.237942
Batch: 0 Epoch: 3 Training Loss: 0.000000
Epoch: 3 Training Loss: 0.000000 Validation Loss: 58.557861 Accuracy: 0.237942
FOLD ACCURACY : 0.2379421221864952
Batch: 0 Epoch: 1 Training Loss: 0.561094
Epoch: 1 Training Loss: 0.561094 Validation Loss: 22.119741 Accuracy: 0.000000
Validation loss decreased (inf --> 22.119741). Saving model ...
Batch: 0 Epoch: 2 Training Loss: 0.000000
Epoch: 2 Training Loss: 0.000000 Validation Loss: 53.926662 Accuracy: 0.000000
Batch: 0 Epoch: 3 Training Loss: 0.000000
Epoch: 3 Training Loss: 0.000000 Validation Loss: 91.861961 Accuracy: 0.000000
FOLD ACCURACY : 0.0
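As a side check on the sampler call in the loop above, here is a minimal sketch of what WeightedRandomSampler yields when it is given only two weights. It reuses the same class counts and the same num_samples of 10. The sampler treats each weight as belonging to one dataset index, so with two weights it can only ever return index 0 or index 1.

```python
import torch
from torch.utils.data import WeightedRandomSampler

# Same call pattern as in the loop above: two class-level weights, 10 draws.
class_sample_count = [74, 859]
weights = 1.0 / torch.tensor(class_sample_count, dtype=torch.double)
sampler = WeightedRandomSampler(weights, num_samples=10)

# Iterating the sampler gives the dataset indices a DataLoader would fetch.
indices = list(sampler)
print(indices)       # ten indices, each either 0 or 1
print(len(indices))  # 10
```

With batch_size=16 this gives a single batch per epoch, which matches the lone "Batch: 0" line in each epoch of the log.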
Is it alright to use WeightedRandomSampler and class weights in cross entropy together? Also, should I use WeightedRandomSampler for the test loader as well?
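For comparison, a minimal sketch of the two mechanisms the question asks about, under some loud assumptions: the toy TensorDataset, the `train_idx` tensor, and the variable names below are hypothetical stand-ins, not the real data or fold split. It shows one weight per training sample being passed to WeightedRandomSampler, and, separately, per-class weights being passed to the `weight` argument of nn.CrossEntropyLoss.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset, TensorDataset, WeightedRandomSampler

torch.manual_seed(0)

# Toy stand-in for the real dataset: 74 samples of class 0, 859 of class 1.
labels = torch.cat([torch.zeros(74, dtype=torch.long),
                    torch.ones(859, dtype=torch.long)])
features = torch.randn(len(labels), 8)
dataset = TensorDataset(features, labels)

# Hypothetical train indices for one fold (here simply every sample).
train_idx = torch.arange(len(labels))
train_labels = labels[train_idx]

# One weight per class, then expanded to one weight per training sample.
class_count = torch.bincount(train_labels, minlength=2)
class_weight = 1.0 / class_count.double()
sample_weight = class_weight[train_labels]

# The sampler yields positions into the training subset, so the fold is
# wrapped in a Subset and the sampler is sized to that subset.
train_subset = Subset(dataset, train_idx.tolist())
sampler = WeightedRandomSampler(sample_weight,
                                num_samples=len(sample_weight),
                                replacement=True)
train_loader = DataLoader(train_subset, batch_size=16, sampler=sampler)

# Fraction of each class actually drawn in one epoch (roughly balanced).
drawn = torch.cat([y for _, y in train_loader])
class_fraction = torch.bincount(drawn, minlength=2).double() / len(drawn)
print(class_fraction)

# Separate mechanism: per-class weights inside the loss itself.
criterion = nn.CrossEntropyLoss(weight=class_weight.float())
logits = torch.randn(4, 2)
targets = torch.tensor([0, 1, 1, 0])
loss = criterion(logits, targets)
print(loss.item())
```

Both mechanisms compensate for the same imbalance, so applying them together weights the minority class up twice. Whether that is desirable depends on the problem. Evaluation loaders are commonly left unweighted so that the reported metrics reflect the real class distribution.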