Dataset : IndexError: list index out of range

Hi,
I created a custom dataset for classification of 8-channel images.
I get an IndexError: list index out of range when I launch the training.
Do you see any errors in my dataset?

import os

import numpy as np
import torch

LABEL_FILENAME = "label.tif"

class Dataset(torch.utils.data.Dataset):
    def __init__(self, data_dir):
        self.data_dir = data_dir
        self.classes = list(range(22))  # class ids 0..21
        # every entry of data_dir is treated as one sample directory
        self.samples = list(os.listdir(self.data_dir))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path = os.path.join(self.data_dir, self.samples[idx])
        print("p", self.samples[idx])
        # read() is not defined in this snippet; it appears to return (array, profile)
        label, profile = read(os.path.join(path, LABEL_FILENAME))
        label = torch.from_numpy(label[0])

        x = read(os.path.join(path, "MHCHD_22E01N.tif"))[0]
        x = torch.from_numpy(np.array(x))

        return x.float(), label.long()

Which line of code is throwing this error?
Could it be that indexing label is not working or the x = read(...)[0] call?

Hi,
Here is the error:

File "/home/Projet/mydataset.py", line 58, in __getitem__
    path = os.path.join(self.data_dir, self.samples[idx])
IndexError: list index out of range

Could you print the length of self.samples and the idx that is causing this error?
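One way to get both values at the moment of failure is to catch the IndexError inside __getitem__, print them, and re-raise. This is only a minimal sketch: DebugSamples is a hypothetical stand-in for the dataset above, reduced to the list lookup so it runs without any image files.

```python
class DebugSamples:
    """Hypothetical stand-in for the dataset, reduced to the sample lookup."""

    def __init__(self, samples):
        self.samples = list(samples)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        try:
            name = self.samples[idx]
        except IndexError:
            # Show the offending index next to the list length, then re-raise
            # so the original traceback is preserved.
            print(f"bad idx={idx!r}, len(self.samples)={len(self.samples)}")
            raise
        return name


ds = DebugSamples(["dalle_238", "dalle_268", "dalle_195"])
print(ds[2])  # valid index

try:
    ds[3]  # one past the end: triggers the debug print
except IndexError:
    pass
```

The same try/except can be placed around the self.samples[idx] lookup in the real __getitem__.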

Thanks for your help ptrblck!!

The print gives me the name of the sample directory. Each directory contains one image (“MHCHD_22E01N.tif”) and one label image (label.tif).

print(self.samples[idx]) : dalle_238
print(self.samples[idx]) : dalle_268
print(self.samples[idx]) : dalle_195

Could you print the idx and try to index samples with this index outside of the training loop?

Hi,
I’m not sure I understand what you mean by "try to index samples with this index".

If I print idx outside the training loop I get:

idx : [0]
self.samples : ['dalle_741', 'dalle_321', 'dalle_444', (…), 'dalle_532']
len(self.samples) : 973

Sorry for not being clear enough.

Print the idx in the training loop and note which idx is causing the error.
Then use this particular value to check whether dataset.samples[idx] exists, or what should be at this position in the list.
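As a rough sketch of that check outside the training loop: index_is_valid is a hypothetical helper, and the sample names below are placeholders; only the length of 973 comes from the output posted above.

```python
def index_is_valid(samples, idx):
    """Return True if samples[idx] would succeed for a plain Python list."""
    return -len(samples) <= idx < len(samples)


# Placeholder names; only the length (973) matches the thread.
samples = [f"dalle_{i}" for i in range(973)]

for idx in (0, 972, 973):
    if index_is_valid(samples, idx):
        print(idx, "->", samples[idx])
    else:
        print(idx, "is out of range for", len(samples), "samples")
```

With 973 samples the last valid index is 972, so an idx of 973 or more points to a problem in whatever code generates the indices, not in the dataset itself.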

Oh, thank you very much!
I found my error!
It was in the way I split my dataset into batches!
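The faulty split itself was not posted, so the following is only an illustrative sketch of one way to build split indices that cannot exceed the dataset length. split_indices, val_fraction and seed are assumed names, not taken from the thread.

```python
import random


def split_indices(n_samples, val_fraction=0.2, seed=0):
    """Shuffle 0..n_samples-1 and cut it into (train, val) index lists."""
    indices = list(range(n_samples))       # every index is < n_samples by construction
    random.Random(seed).shuffle(indices)   # local RNG, reproducible for a given seed
    n_val = int(n_samples * val_fraction)
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = split_indices(973)
print(len(train_idx), len(val_idx))
print(max(train_idx + val_idx) < 973)  # no index can fall outside the dataset
```

Index lists like these can be passed to torch.utils.data.Subset(dataset, indices), so each subset only ever asks the dataset for positions that exist.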