Error when enumerating dataloader

When enumerating over my dataset I get this error:

mask_path = self.mask_files[idx]
IndexError: list index out of range

from this line:

for index, (tiles, labels) in enumerate(loader):

I tried other datasets and they work fine. The problem seems to come from a subsetting step I do when loading the dataset to save time. When I use this dataset I get the error. It may come from that, but I do not see how.

import random
from torch.utils.data import Dataset

class GreenhouseDataset(Dataset):
    def __init__(self, folder_data):

        # lists to store the data
        self.mask_files = []
        self.img_files = []

        # getting the list of files
        list_files = fun.get_files(folder_data)
        # keep a random tenth of the files to speed up loading
        list_files = random.sample(list_files, len(list_files)//10)

        # extracting the data
        for filename in list_files:
            self.mask_files.append(filename)
            self.img_files.append(filename)

PS: it's my first post on this forum.

One way to get this error could be to have the dataset claim a length larger than the list you are indexing.
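As a rough illustration of that failure mode, here is a minimal sketch. The class, its `__len__`, and the file names are hypothetical (the posted code does not show `__len__` or `__getitem__`), and plain Python is enough to reproduce it, because a DataLoader's default samplers draw indices from range(len(dataset)).

```python
class MismatchedDataset:
    """Hypothetical stand-in for a map-style dataset (no torch required)."""

    def __init__(self):
        # Made-up file names: four images but only two masks.
        self.img_files = ["img_0.png", "img_1.png", "img_2.png", "img_3.png"]
        self.mask_files = ["mask_0.png", "mask_1.png"]

    def __len__(self):
        # The length is taken from the images, so it overstates the masks.
        return len(self.img_files)

    def __getitem__(self, idx):
        return self.img_files[idx], self.mask_files[idx]


ds = MismatchedDataset()
failed_at = None
for idx in range(len(ds)):  # the order a sequential sampler would use
    try:
        ds[idx]
    except IndexError:
        failed_at = idx  # first index that is valid for images but not for masks
        break

print("first failing index:", failed_at)
```

With shuffling enabled, the first index drawn may already be past the end of the shorter list, so the failure can show up on the very first batch.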

Best regards

Thomas

PS Enclosing your code in triple backticks ``` will give it code formatting.

PPS Welcome here!


Thanks for your comment.
I am not sure I completely follow your advice: the error occurs at the first element of the for loop, as if there were no first element.
Strangely, when I ran the loop outside of my train function, it worked.
I checked everything again and the dataset is not altered in the process, so I don't understand why I get this error inside the function.

You could just print the idx in your __getitem__ and also compare it to len(ds) and len(ds.mask_files).
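A small sketch of that check, assuming the dataset exposes img_files and mask_files as in the posted class. TinyDataset and lengths_agree are hypothetical names used only for illustration.

```python
class TinyDataset:
    """Hypothetical dataset with the same two attributes as the posted class."""

    def __init__(self, img_files, mask_files):
        self.img_files = img_files
        self.mask_files = mask_files

    def __len__(self):
        return len(self.img_files)

    def __getitem__(self, idx):
        # Print the requested index next to both lengths before indexing.
        print(f"idx={idx}, len(self)={len(self)}, "
              f"len(mask_files)={len(self.mask_files)}")
        return self.img_files[idx], self.mask_files[idx]


def lengths_agree(ds):
    """Return True if the reported length matches both file lists."""
    sizes = (len(ds), len(ds.img_files), len(ds.mask_files))
    print("len(ds), len(img_files), len(mask_files) =", sizes)
    return len(set(sizes)) == 1


# Made-up file names: three images, one mask.
ds = TinyDataset(["a.png", "b.png", "c.png"], ["a_mask.png"])
ok = lengths_agree(ds)
```

Running the check right before building the DataLoader, inside the same function that fails, shows whether the dataset seen there is the one you expect.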


I found the issue.
When subsetting the data, I was sampling from the whole folder, which contains both masks and images, so the sample ended up with unequal numbers of masks and images.
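One possible way to avoid this, sketched under assumptions: the image and mask paths are collected into two separate lists whose sorted order lines up one-to-one, and pairs are sampled rather than single files. The function name, the divisor parameter, and the file names below are hypothetical and not taken from the original code.

```python
import random


def sample_pairs(img_files, mask_files, divisor=10, seed=None):
    """Keep roughly 1/divisor of the (image, mask) pairs, sampled together."""
    if len(img_files) != len(mask_files):
        raise ValueError("expected one mask per image")
    # Assumption: sorting both lists puts matching files at the same position.
    pairs = list(zip(sorted(img_files), sorted(mask_files)))
    if not pairs:
        return [], []
    rng = random.Random(seed)
    k = max(1, len(pairs) // divisor)
    subset = rng.sample(pairs, k)  # each draw keeps an image with its mask
    imgs, masks = zip(*subset)
    return list(imgs), list(masks)


# Made-up, zero-padded names so that sorted order matches between the lists.
all_imgs = [f"img_{i:03d}.png" for i in range(20)]
all_masks = [f"mask_{i:03d}.png" for i in range(20)]
imgs, masks = sample_pairs(all_imgs, all_masks, divisor=10, seed=0)
print(len(imgs), len(masks))
```

Because whole pairs are sampled, the two lists always have the same length and index idx refers to the same sample in both.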