Transfer learning folder names don't match the category names

I have a small problem with the transfer learning tutorial: the folder names that are used as class names get mixed up. Could you please advise how to fix it?

*I am adding the notebook contents here because the gist would not load even though I uploaded it.
My category names are 1, 2, …, 16, and the number of test images in each folder is:
./test : 135
./test/1 : 5
./test/2 : 2
./test/3 : 4
./test/4 : 63
./test/5 : 2
./test/6 : 0
./test/7 : 12
./test/8 : 8
./test/9 : 13
./test/10 : 3
./test/11 : 1
./test/12 : 3
./test/13 : 3
./test/14 : 6
./test/15 : 2
./test/16 : 8

And the accuracies I get from the tutorial are:
class 1 --> accuracy: 80.00, correct predictions: 4, all: 5
class 2 --> accuracy: 33.33, correct predictions: 1, all: 3
class 3 --> accuracy: 0.00, correct predictions: 0, all: 1
class 4 --> accuracy: 0.00, correct predictions: 0, all: 3
class 5 --> accuracy: 0.00, correct predictions: 0, all: 3
class 6 --> accuracy: 33.33, correct predictions: 2, all: 6
class 7 --> accuracy: 0.00, correct predictions: 0, all: 2
class 8 --> accuracy: 12.50, correct predictions: 1, all: 8
class 9 --> accuracy: 50.00, correct predictions: 1, all: 2
class 10 --> accuracy: 25.00, correct predictions: 1, all: 4
class 11 --> accuracy: 93.65, correct predictions: 59, all: 63
class 12 --> accuracy: 50.00, correct predictions: 1, all: 2
class 13 --> accuracy: nan, correct predictions: 0, all: 0
class 14 --> accuracy: 91.67, correct predictions: 11, all: 12
class 15 --> accuracy: 37.50, correct predictions: 3, all: 8
class 16 --> accuracy: 69.23, correct predictions: 9, all: 13
total correct: 93, total samples: 135

As you can see, folder 4 (63 images) is reported here as class 11.

The following is the part of the code used for calculating accuracies:


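import ntpath

import torch
import torch.nn.functional as F
from torch.utils.data.sampler import WeightedRandomSampler

# model_ft, dataloaders and device are defined earlier in the tutorial notebook
model_ft.eval()

nb_classes = 16

# rows are true class indices, columns are predicted class indices
confusion_matrix = torch.zeros(nb_classes, nb_classes)

_classes = []
_preds = []
predicted_labels = []

# softmax outputs of all batches are concatenated into this tensor
class_probs = torch.Tensor()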

im_paths = []
with torch.no_grad():
    for i, (inputs, classes, im_path) in enumerate(dataloaders['test']):
        im_paths.append(im_path)
        inputs = inputs.to(device)
        classes = classes.to(device)

        # shift the 0-based class indices by one to get labels 1..16
        classes_list = classes.cpu().numpy().tolist()
        _classes[:] = [c + 1 for c in classes_list]

        outputs = model_ft(inputs)

        # keep the accumulated probabilities on the same device as the outputs
        class_probs = class_probs.to(device)
        class_probs = torch.cat((class_probs, F.softmax(outputs, 1)))

        _, preds = torch.max(outputs, 1)
        preds_list = preds.cpu().numpy().tolist()
        _preds[:] = [p + 1 for p in preds_list]

        predicted_labels.append(preds_list)
        for t, p in zip(classes.view(-1), preds.view(-1)):
            confusion_matrix[t.long(), p.long()] += 1
                
print(confusion_matrix)
per_class_accuracies = (confusion_matrix.diag() / confusion_matrix.sum(1)).tolist()

print(','.join("{:2.04f}".format(x) for x in per_class_accuracies))
total_correct = 0
total = 0
for i in range(nb_classes):
    correct = int(confusion_matrix[i, i].item())
    count = int(confusion_matrix[i].sum().item())
    total_correct += correct
    total += count
    # each row is labelled i + 1 because the class indices are 0-based
    print("class {:d} --> accuracy: {:.2f}, correct predictions: {:d}, all: {:d}".format(
        i + 1, per_class_accuracies[i] * 100, correct, count))

print("total correct: {}, total samples: {}".format(total_correct, total))

# flatten the per-batch path tuples into a single list
flattened_im_paths = [item for sublist in im_paths for item in sublist]

print("length is: ", len(flattened_im_paths))
for i in range(len(flattened_im_paths)):
    class_p = class_probs[i].cpu().numpy().tolist()
    print('{}, {}'.format(ntpath.basename(flattened_im_paths[i]), class_p))

Regarding the mismatch between the folders and the class predictions:
If you are using ImageFolder, the folder names are sorted internally as strings rather than as numbers, so the class indices are assigned like this:

1 - class0
10 - class1
11 - class2
...
2 - ...
20 - ...
21 - ...
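
The ordering above can be reproduced with plain Python string sorting. This is only a minimal sketch that assumes the 16 folder names from the question; the variable names are made up for illustration and nothing here calls torchvision.

```python
# Folder names as in the question: "1" ... "16"
folders = [str(n) for n in range(1, 17)]

# Strings are compared character by character, so "10" sorts before "2"
sorted_folders = sorted(folders)

# Index given to each folder when the names are taken in sorted order
class_to_idx = {name: idx for idx, name in enumerate(sorted_folders)}

for name, idx in class_to_idx.items():
    # The reporting loop in the question prints idx + 1 as the class number
    print("folder {:>2} -> class index {:2d} (printed as class {:d})".format(
        name, idx, idx + 1))
```

Under this ordering, folder 4 gets index 10, which the reporting loop prints as class 11, and that is the row with 63 images in the output above.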

You could prepend zeros to your folder names:

01 - class0
02 - class1
03 - ...

or write a custom Dataset with your own folder-to-class mapping.
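
As a rough illustration of the custom mapping option, the sketch below builds a folder-to-index mapping in numeric order. `find_classes_numeric` is a hypothetical helper, not a torchvision function, and it assumes every subfolder name is an integer. Recent torchvision versions expose a `find_classes` method on `ImageFolder` that returns the same kind of pair, so a subclass could probably delegate to a helper like this, but please check that against your installed version.

```python
import os


def find_classes_numeric(directory):
    """Return (classes, class_to_idx) with folder names in numeric order.

    Hypothetical helper: assumes every subfolder name is an integer,
    such as "1" or "16".
    """
    names = [entry.name for entry in os.scandir(directory) if entry.is_dir()]
    classes = sorted(names, key=int)  # numeric order: "2" before "10"
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    return classes, class_to_idx
```

Calling `find_classes_numeric('./test')` on the layout from the question should map folder 1 to index 0 and folder 16 to index 15, so printing `i + 1` would line up with the folder names again.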


Thank you so much. I confirm that zero-padding fixed the problem :)
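
In case it helps someone else, the zero-padding can be scripted instead of renaming by hand. This is a minimal sketch: `zero_pad_folders` is a made-up helper name, and it assumes the class folders are the purely numeric subfolders of the given directory.

```python
import os


def zero_pad_folders(root, width=2):
    """Rename numeric subfolders of root to zero-padded names, e.g. "4" -> "04".

    Returns a dict mapping each old name to its new name.
    """
    renamed = {}
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if not os.path.isdir(path) or not name.isdigit():
            continue  # leave files and non-numeric folders alone
        padded = name.zfill(width)
        if padded == name:
            continue  # already wide enough, e.g. "10"
        target = os.path.join(root, padded)
        if os.path.exists(target):
            raise FileExistsError("refusing to overwrite {}".format(target))
        os.rename(path, target)
        renamed[name] = padded
    return renamed
```

The same renaming would need to be applied to every split (for example the training and validation folders as well as `./test`) so that all of them end up with the same class indices.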