Checking accuracy using a saved model in PyTorch

Hi,
I saved a model in PyTorch. The saved model gave me a maximum accuracy of 89.5 % on the validation dataset. But when I plotted the confusion matrix using the saved model, it gave me around 52 % accuracy. I do not understand this. Any tips on what is happening here?

I am using this code:


import torch
import torchvision
from torch.utils.data import DataLoader, random_split

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

nb_classes = 2

# Start from an ImageNet-pretrained ResNet-18 and swap in a 2-class head
# so that the saved 2-class state dict fits.
net = torchvision.models.resnet18(pretrained=True)
net.fc = torch.nn.Linear(net.fc.in_features, nb_classes)

# `data` is the full dataset defined earlier. Note: without a fixed
# generator seed, random_split produces a different split on every run.
train_data, valid_data, test_data = random_split(
    data,
    [int(len(data) * 0.8), int(len(data) * 0.2),
     len(data) - (int(len(data) * 0.8) + int(len(data) * 0.2))])

train_loader = DataLoader(train_data, batch_size=32, shuffle=True, pin_memory=True, drop_last=True)
val_loader = DataLoader(valid_data, batch_size=32, shuffle=True, pin_memory=False, drop_last=True)
test_loader = DataLoader(test_data, batch_size=32, shuffle=True, pin_memory=False, drop_last=True)

model = net.to(device)
model.load_state_dict(torch.load("Desktop/saved models/state_dict_resnet18_pre_trained.pt", map_location=device))
model.eval()

# confusion matrix: rows = true class, columns = predicted class
confusion_matrix = torch.zeros(nb_classes, nb_classes)
with torch.no_grad():
    for inputs, classes in train_loader:
        inputs = inputs.to(device)
        classes = classes.to(device)
        outputs = model(inputs)
        _, preds = torch.max(outputs, 1)
        for t, p in zip(classes.view(-1), preds.view(-1)):
            confusion_matrix[t.long(), p.long()] += 1

print(confusion_matrix)
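
The overall accuracy can be read off this matrix (correct predictions on the diagonal divided by all samples), for example:

acc = confusion_matrix.diag().sum() / confusion_matrix.sum()
print(f"overall accuracy: {acc.item():.4f}")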

Hi,

Accuracy alone is not always a good metric for classification. Did you check the per-class F1 scores?
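
For example, per-class precision, recall, and F1 can be derived from the confusion matrix you already build; a minimal sketch (the small epsilon guarding against empty classes is my addition):

# per-class metrics from an n x n confusion matrix
# (rows = true class, columns = predicted class)
for c in range(nb_classes):
    tp = confusion_matrix[c, c].item()
    fp = confusion_matrix[:, c].sum().item() - tp  # predicted c, true class differs
    fn = confusion_matrix[c, :].sum().item() - tp  # true class c, predicted otherwise
    precision = tp / (tp + fp + 1e-12)
    recall = tp / (tp + fn + 1e-12)
    f1 = 2 * precision * recall / (precision + recall + 1e-12)
    print(f"class {c}: precision {precision:.3f}, recall {recall:.3f}, F1 {f1:.3f}")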

Also, about the 89.5 % validation accuracy: was it computed for a particular batch or over the whole set?

And for the confusion matrix, you computed the overall result on the train set? Is that the same data the model was trained on?

Hi,
I am editing my comment here: the accuracy was calculated over a particular batch.

In that case, the best per-batch accuracy does not reflect the performance over the complete dataset. It may simply correspond to a batch whose samples the model happened to classify well.
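
A sketch of the difference, reusing model, val_loader, and device from your code: track the best single-batch accuracy alongside the accuracy over the whole set.

best_batch_acc, correct, total = 0.0, 0, 0
model.eval()
with torch.no_grad():
    for inputs, classes in val_loader:
        inputs, classes = inputs.to(device), classes.to(device)
        preds = model(inputs).argmax(dim=1)
        batch_correct = (preds == classes).sum().item()
        best_batch_acc = max(best_batch_acc, batch_correct / classes.size(0))
        correct += batch_correct
        total += classes.size(0)
print(f"best single-batch accuracy: {best_batch_acc:.4f}")  # can sit far above...
print(f"accuracy over the whole set: {correct / total:.4f}")  # ...the real number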

Thanks for your feedback. But the accuracy over all batches was above 80%. Shouldn’t that give an accuracy better than 52%?

You tested on the training set, right? What was your training accuracy?

Yes, on training it was around 98%.

I guess it can happen; there are a couple of possible explanations (a quick check for both is sketched after the list):

  • The model is overfitted on the training set (high variance).
  • The test data is out of distribution, even relative to the validation set.
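
A quick way to tell these apart is to evaluate the same saved model, through the same code path, on all three splits; a sketch using the loaders from above (loader_accuracy is a hypothetical helper, not part of the original code):

def loader_accuracy(model, loader, device):
    # fraction of correctly classified samples over the whole loader
    correct, total = 0, 0
    model.eval()
    with torch.no_grad():
        for inputs, classes in loader:
            inputs, classes = inputs.to(device), classes.to(device)
            preds = model(inputs).argmax(dim=1)
            correct += (preds == classes).sum().item()
            total += classes.size(0)
    return correct / total

# a large train/val gap suggests overfitting; a large gap between the
# validation and test splits suggests a distribution shift between them
for name, loader in [("train", train_loader), ("val", val_loader), ("test", test_loader)]:
    print(f"{name} accuracy: {loader_accuracy(model, loader, device):.4f}")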