Why is my custom Dataset not working?

Hello,
I have some images in a folder, and I am trying to create a custom dataset with help from this post. However, I am getting some errors.

My custom dataset class is given below:

class CustomDataSet(Dataset):
    def __init__(self, main_dir, transform):
        self.main_dir = main_dir
        self.transform = transform
        all_imgs = os.listdir(main_dir)
        self.total_imgs = natsort.natsorted(all_imgs)  # Error-1

    def __len__(self):
        return len(self.all_imgs)

    def __getitem__(self, idx):
        img_loc = os.path.join(self.main_dir, self.all_imgs[idx])
        image = Image.open(img_loc).convert("RGB")
        tensor_image = self.transform(image)
        return tensor_image

My loader code:

train_data_dir = '/home/Houses-dataset-master'

# Transformation
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# Giving the path
train_data_tensor = CustomDataSet(train_data_dir, transform=transform)

# Trying to print the length of train_data_tensor
print(len(train_data_tensor))  # Getting Error-2

# Converting labels into a tensor
train_label_price_tensor = torch.tensor(label_price.values)
print(train_label_price_tensor.size())
print(train_data_tensor.size())  # Getting Error-3

First, I got an error for natsort:

     Error-1: NameError: name 'natsort' is not defined

Then I removed natsort and replaced every total_imgs with all_imgs. Now I get:

Error-2: AttributeError: 'CustomDataSet' object has no attribute 'all_imgs'

Finally, I removed the line that triggers Error-2 from the program and got another error when printing the size of my image tensor:

Error-3: AttributeError: 'CustomDataSet' object has no attribute 'size'

Does this mean my custom dataset is not working? Or that the transformation is not working?

How can I solve these issues?

Thank you.

You’ve missed the self for all_imgs.

Change all_imgs = os.listdir(main_dir) to self.all_imgs = os.listdir(main_dir).

Try that first. It should fix Error-2.
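To illustrate why the missing self matters, here is a minimal sketch that needs neither PyTorch nor real images. The class name ImageFolderListing and the placeholder file names are hypothetical. Built-in sorted() stands in for natsort.natsorted, so the order is lexicographic rather than natural. A name assigned without self inside __init__ is a local variable and is gone by the time __len__ runs.

```python
import os
import tempfile


class ImageFolderListing:
    """Hypothetical stand-in for the __init__/__len__ part of CustomDataSet."""

    def __init__(self, main_dir):
        self.main_dir = main_dir
        # Store the listing on self so that other methods can see it.
        # A plain `all_imgs = ...` would be a local variable that is
        # discarded as soon as __init__ returns.
        self.all_imgs = sorted(os.listdir(main_dir))

    def __len__(self):
        return len(self.all_imgs)


# Build a throwaway directory holding three empty placeholder files.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("a.jpg", "b.jpg", "c.jpg"):
        open(os.path.join(tmp_dir, name), "w").close()
    listing = ImageFolderListing(tmp_dir)
    num_files = len(listing)

print(num_files)
```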

@pchandrasekaran Thank you very much.

Your suggestion solved Error-2, but I am still getting Error-3:

Error-3: AttributeError: 'CustomDataSet' object has no attribute 'size'

Could you tell me why this is happening? Is the transformation not working?

That’s because the Dataset class doesn’t contain any implementation for size(). Do you need the size of the entire dataset as in [total_len, C, H, W] or are you simply trying to get the size() of the returned value?

@pchandrasekaran thank you,

Yes, I was simply trying to find the length of the dataset. Since I am using ToTensor in the transform, my dataset is converted into tensor format, right? If ToTensor worked, then I should get the size, right?

However, I need to pass both data_tensor and label_tensor to the DataLoader. The code is given below:

    # Making tensor using both image and labels tensor
    train_tensor = TensorDataset(train_data_tensor, train_label_price_tensor)
    
    # Making Train Dataloader
    train_loader = DataLoader(train_tensor, batch_size=1,
                              num_workers=2, shuffle=True)

I am getting this error:

AttributeError: 'CustomDataSet' object has no attribute 'size'

Could you tell me how I can solve this problem?

A Dataset object does not have a size() method, as the error suggests.
To get data out of a Dataset object, index into it.
For example:

img = train_data_tensor[0]
print(img.size()) # or print(img.shape)
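As a rough sketch of the difference between the dataset and its samples: the hypothetical RandomImages class below returns random tensors in place of real images, which is an assumption made only to keep the example self-contained. len() gives the number of samples, indexing gives one tensor with a shape, and stacking all samples is one way to get a single [total_len, C, H, W] tensor if that is really needed.

```python
import torch
from torch.utils.data import Dataset


class RandomImages(Dataset):
    """Hypothetical dataset returning random 3x256x256 tensors instead of real images."""

    def __init__(self, num_images):
        self.num_images = num_images

    def __len__(self):
        return self.num_images

    def __getitem__(self, idx):
        if idx < 0 or idx >= self.num_images:
            raise IndexError(idx)
        return torch.rand(3, 256, 256)


dataset = RandomImages(4)

num_items = len(dataset)              # number of samples, via __len__
one_shape = dataset[0].shape          # shape of one sample, via __getitem__
has_size = hasattr(dataset, "size")   # the Dataset object itself has no size()

# Stacking every sample gives one [total_len, C, H, W] tensor.
# This holds the whole dataset in memory, so it only suits small datasets.
all_images = torch.stack([dataset[i] for i in range(len(dataset))])

print(num_items, one_shape, has_size, all_images.shape)
```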

@InnovArul Thanks,

Then how can I make a DataLoader from this dataset? I am using this code:

# Making tensor using both image and labels tensor
train_tensor = TensorDataset(train_data_tensor, train_label_price_tensor)

# Making Train Dataloader
train_loader = DataLoader(train_tensor, batch_size=1,
                          num_workers=2, shuffle=True)

But I am getting this error:

AttributeError: 'CustomDataSet' object has no attribute 'size'

Could you tell me how I can solve this problem?

@InnovArul, after applying your suggestion, I am getting the shape of an image.

I am confused at this point.
I see that you have your own dataset class, CustomDataSet. I also see that you are using TensorDataset.

You can pass an object of any Dataset class to DataLoader.

Are train_data_tensor and train_label_price_tensor tensors here? If they are tensors, I am not sure why the error mentions CustomDataSet.

@InnovArul, I am using a transform in my CustomDataSet, so I am guessing that the code converts the dataset into a tensor (I am not sure).

train_label_price_tensor is a tensor. I made it using:

train_label_price_tensor = torch.tensor(label_price.values)

Is it possible to convert a CustomDataSet into a tensor?

TensorDataset accepts only tensors as input, so a CustomDataSet object can’t be passed to it.
Why not pass the labels to the custom dataset and return (image, label) from __getitem__,
and then pass the CustomDataSet directly to the DataLoader?
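A minimal sketch of that idea follows. The class name ImagePriceDataset is hypothetical, and the images are random in-memory tensors so that the example runs without any files. In the real dataset, __getitem__ would open the image from disk and apply the transform, as in the original class. num_workers is set to 0 here only to keep the sketch simple.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ImagePriceDataset(Dataset):
    """Hypothetical dataset pairing each image with its price label."""

    def __init__(self, images, prices):
        assert len(images) == len(prices), "need exactly one label per image"
        self.images = images   # stand-in for loading and transforming files
        self.prices = prices

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Return the sample and its label together as a tuple.
        return self.images[idx], self.prices[idx]


# Synthetic data: 8 random RGB images and 8 made-up price labels.
images = torch.rand(8, 3, 256, 256)
prices = torch.arange(8, dtype=torch.float32)

train_data = ImagePriceDataset(images, prices)
train_loader = DataLoader(train_data, batch_size=4, shuffle=True, num_workers=0)

# The default collate function batches images and labels separately.
batch_images, batch_prices = next(iter(train_loader))
print(batch_images.shape, batch_prices.shape)
```

With this layout no TensorDataset is needed, because the DataLoader pulls (image, label) pairs straight from the custom dataset.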

@InnovArul thanks.

I have now implemented your suggestion: the CustomDataSet returns both images and labels.