Argmax translation to which category

rmnvcc · September 15, 2018, 2:46pm

I made a custom dataset with dataloader, which was about 3 different categories images. I used vgg16 to predict which category was in the image.

If i want to predict a single image however, i would get back something like this:

tensor([[-0.0559, -1.6212, -0.3467]], grad_fn=)

How would I know which of the categories correspons with index 0 or index 1?

I’ve seen other problems where do something like this:

net = torch.load(‘pytorch_Network2.h5’)
idx_to_class = {
0: ‘airplane’,
1: ‘automobile’,
2: ‘bird’,
}
output = net(img)

pred = torch.argmax(output, 1)

for p in pred:

cls = idx_to_class[p.item()]

print(cls)

But then, how would you know index 0 is a plane? Does that have to do with how you build your dataset? in that case, This is how i’ve built it:

class SophistyDataSet(Dataset):
    """Dataset wrapping images and target labels for 3 categories 'Modern','Vintage','Classic'

    Arguments:
        A CSV file path
        Path to image folder
        Extension of images
        PIL transforms
    """

    def __init__(self, directory_list, img_path):
        raw_images_list = self.prepare_dataset(directory_list)
        images_df = pd.DataFrame(raw_images_list, columns=['name', 'tag'])
        self.mlb = MultiLabelBinarizer()
        self.transform = transforms.Compose([transforms.Resize((IMG_SIZE, IMG_SIZE)), transforms.ToTensor()])
        self.img_path = img_path
        self.X_train = images_df['name']
        self.y_train = self.mlb.fit_transform(images_df['tag'].str.split()).astype(np.float32)

    def __getitem__(self, index):
        img = Image.open(
            self.img_path + self.determine_imagename(self.X_train[index]) + '/' + self.X_train[index]).convert('RGB')
        if self.transform is not None:
            img = self.transform(img)

        label = self.y_train[index]
        return img, label

    def __len__(self):
        return len(self.X_train.index)```

tom · September 16, 2018, 6:53am

You need to keep track of this mapping.

Best regards

Thomas