Problem extracting labels from my dataset

nima_pw · February 4, 2023, 2:09pm

Hi.
I have an image dataset with 35 classes, all the images are in one folder, and one part of the names of the images is their label. An example of image names is like this:
D34_Samsung_GalaxyS3Mini-images-flat-D01_I_flat_0001.jpg
And the label becomes D01 here.

In the Dataset class definition, the target variable should return the image label, right? If we consider the index, 34 should be returned for this example.
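A minimal sketch of the label extraction described above, assuming every file name starts with `D`, the device number, and an underscore, as in the example. The helper name `label_from_path` and the directory in the usage line are hypothetical. The directory part is stripped first so that underscores in folder names do not interfere, and 1 is subtracted on the assumption that the devices are numbered D01 to D35, since `nn.CrossEntropyLoss` expects class indices starting at 0.

```python
import os
import re

# Matches the leading device ID of a file name such as
# "D34_Samsung_GalaxyS3Mini-images-flat-D01_I_flat_0001.jpg".
_DEVICE_ID = re.compile(r"^D(\d+)_")


def label_from_path(image_path: str) -> int:
    """Return a zero-based class index parsed from the file name (hypothetical helper)."""
    file_name = os.path.basename(image_path)
    match = _DEVICE_ID.match(file_name)
    if match is None:
        raise ValueError(f"no device ID found in {file_name!r}")
    # Assumption: devices are numbered D01..D35, so subtract 1 to get 0..34.
    return int(match.group(1)) - 1


# Hypothetical directory, file name taken from the example above.
print(label_from_path("data/my_images/D34_Samsung_GalaxyS3Mini-images-flat-D01_I_flat_0001.jpg"))  # 33
```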

I have this code to define the dataset:

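import re

from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class MyDataset(Dataset):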
    
    def __init__(self, imgs , transform = None):
        self.imgs = imgs
        self.transform = transform or transforms.ToTensor()
        self.class_to_idx = {}

    def __getitem__(self, index):
        
        image_path = self.imgs[index]
        target = image_path.split('_')[0]
        target = re.findall(r'D\d+.+' , target)
        
        image = Image.open(image_path)
        
        if self.transform is not None:
            image = self.transform(image)

        # Indices are assigned lazily, in the order the labels are first seen.
        if target[0] in self.class_to_idx : 
            target = [self.class_to_idx[target[0]]]
        else : 
            self.class_to_idx[target[0]] = len(self.class_to_idx)
            target = [self.class_to_idx[target[0]]]

        return image , target
    
    def __len__(self):
        return len(self.imgs)

But when I tested it, I realized that it does not extract the labels correctly. There are 35 classes, but the target is always a number between 0 and 15, that is, below the batch size (batch size = 16).
Also, an image may get a different label each time the code is executed!

An output of the above code is shown in pic No.1
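A possible explanation for this symptom: `class_to_idx` is filled lazily inside `__getitem__`, so each label receives the next free index in the order the samples happen to be loaded. After 16 samples at most 16 labels have been seen, so the first batch can only contain indices 0 to 15, and with shuffling the assignment can change between runs. A minimal sketch of the alternative, building the mapping once from the sorted label names; `build_class_to_idx` is a hypothetical helper and the file names in the usage lines are made up for illustration.

```python
import os


def build_class_to_idx(image_paths):
    """Map each device label (e.g. "D34") to a fixed index, independent of load order."""
    labels = {os.path.basename(p).split("_")[0] for p in image_paths}
    # Sorting makes the mapping deterministic across runs and across shuffles.
    return {label: idx for idx, label in enumerate(sorted(labels))}


# Made-up file names that follow the naming scheme from the post.
paths = [
    "D34_a-images-flat-D01_I_flat_0001.jpg",
    "D02_b-images-flat-D01_I_flat_0001.jpg",
    "D34_a-images-flat-D01_I_flat_0002.jpg",
]
mapping = build_class_to_idx(paths)
print(mapping)  # {'D02': 0, 'D34': 1}
```

Because the device numbers are zero-padded (D01, D02, ...), sorting the strings gives the same order as sorting the numbers. The mapping would be built in `__init__` and only read in `__getitem__`.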

So I changed the Dataset code. I removed a few lines of code and directly obtained the label from the name of the images instead of using class_to_idx:

class MyDataset(Dataset):
    
    def __init__(self, imgs , transform = None):
        self.imgs = imgs
        self.transform = transform or transforms.ToTensor()
        self.class_to_idx = {}

    def __getitem__(self, index):
        
        image_path = self.imgs[index]
        # e.g. "D34_Samsung_..." -> "D34" -> "34" -> 34
        target = image_path.split('_')[0]
        target = target.split('D')[1]
        target = int(target)
        
        image = Image.open(image_path)
        
        if self.transform is not None:
            image = self.transform(image)

        return image , target
    
    def __len__(self):
        return len(self.imgs)

When I tested it, the numbers were no longer between 0 and 15; they were the real labels of the images (Pic No.2).

My problem is that when I train the model with the first code, my CNN model trains correctly and does not give an error.
But with the second code (my edited version), even though the output was correct in the test, the model cannot train and raises an error (Pic No.3).

Whatever I search for, the answers I find are about the model. But there is no problem with the model, and it does not give an error with the first code.
Thank you for your advice.

ptrblck · February 4, 2023, 8:45pm

Your target tensor seems to be empty as seen in this code:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
output = torch.randn(16, 10, requires_grad=True)
target = torch.randint(0, 10, (0,))
print(target)
# tensor([], dtype=torch.int64)

loss = criterion(output, target)
# ValueError: Expected input batch_size (16) to match target batch_size (0).

Check why that’s the case and make sure it contains the expected number of elements.
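One way to do that check is to look at what the default collate function makes of the two target forms used in this thread. The training loop is not shown, so the following is only a sketch with made-up tensors, assuming a PyTorch version that exposes `default_collate` in `torch.utils.data`. An `int` target is collated into a tensor of shape `(batch_size,)`, while a one-element list is collated into a list holding one such tensor.

```python
import torch
import torch.nn as nn
from torch.utils.data import default_collate

# Four fake samples: a 3x8x8 "image" plus a class index.
samples_int = [(torch.zeros(3, 8, 8), i) for i in range(4)]     # target is an int
samples_list = [(torch.zeros(3, 8, 8), [i]) for i in range(4)]  # target is [int]

_, target_int = default_collate(samples_int)
_, target_list = default_collate(samples_list)

print(type(target_int), target_int.shape)       # a tensor of shape (4,)
print(type(target_list), target_list[0].shape)  # a list holding one tensor of shape (4,)

# nn.CrossEntropyLoss expects class indices of shape (batch_size,).
criterion = nn.CrossEntropyLoss()
output = torch.randn(4, 35)
loss_int = criterion(output, target_int)
loss_list = criterion(output, target_list[0])  # the list form needs [0] first
```

A training loop written for one of the two forms will generally need adjusting for the other, so printing `target` and its shape right before the loss call is a quick way to see which form actually arrives.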

nima_pw · February 6, 2023, 7:52am

I changed the code as below, and my problem was solved.
The “target” needed to be looked up in the class_to_idx dictionary and returned as a one-element list, as in the first version:

class MyDataset(Dataset):
    
    def __init__(self, imgs , transform = None):
        self.imgs = imgs
        self.transform = transform or transforms.ToTensor()
        self.class_to_idx = {}

    def __getitem__(self, index):
        
        image_path = self.imgs[index]
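        # "D34_Samsung_GalaxyS3Mini-images-..." -> "D34_Samsung_GalaxyS3Mini"
        target = image_path.split('-')[0]
        # "D34_Samsung_GalaxyS3Mini" -> "D34" -> "34"
        label = target.split('_')[0]
        label = label.split('D')[1]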
        
        image = Image.open(image_path)
        
        if self.transform is not None:
            image = self.transform(image)

        if target in self.class_to_idx : 
            target = [self.class_to_idx[target]]
        else : 
            self.class_to_idx[target] = (int(label)-1)
            target = [self.class_to_idx[target]]

        return image , target
    
    def __len__(self):
        return len(self.imgs)
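
For comparison, a minimal sketch of a variant that parses all labels once in `__init__` and returns the target as a plain `int`, so that the default collate function produces a tensor of shape `(batch_size,)`. This is not the code from the post above: the class name `DeviceDataset`, the `loader` parameter, and the assumption that labels run from D01 to D35 are all illustrative, and a training loop that indexes `target[0]` would have to use `target` directly instead.

```python
import os
import re

from torch.utils.data import Dataset


def pil_loader(path):
    # Imported lazily so the class can be exercised without Pillow installed.
    from PIL import Image
    return Image.open(path).convert("RGB")


class DeviceDataset(Dataset):
    """Hypothetical dataset: the label is the leading D<number> of the file name."""

    def __init__(self, imgs, transform=None, loader=pil_loader):
        self.imgs = list(imgs)
        self.transform = transform
        self.loader = loader
        # Parse every label up front so indices never depend on load order.
        self.targets = [self._parse_label(p) for p in self.imgs]

    @staticmethod
    def _parse_label(path):
        match = re.match(r"D(\d+)_", os.path.basename(path))
        if match is None:
            raise ValueError(f"no device ID in {path!r}")
        return int(match.group(1)) - 1  # assumption: D01..D35 -> 0..34

    def __getitem__(self, index):
        image = self.loader(self.imgs[index])
        if self.transform is not None:
            image = self.transform(image)
        return image, self.targets[index]

    def __len__(self):
        return len(self.imgs)
```

Pass `transforms.ToTensor()` as `transform`, as in the original code, so that the images can be stacked into batches by a `DataLoader`.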