datasets.ImageFolder reports a different dataset length each time

After loading my data with datasets.ImageFolder, I tried printing the length of the dataset. It is supposed to be 54000 images, but print(len(data)) outputs a different number, and when I run it again it outputs yet another number. Does anyone know why it doesn’t show the exact length of the dataset? Is there a limit on the number of images?

Could you post the code you are using to create your Dataset?
Also some information about the folder structure would be interesting to see.

# Import the module under an alias so the variable below does not shadow it
# (re-running the cell would otherwise fail on transforms.Compose).
import torchvision.transforms as T
from torchvision import datasets

transforms = T.Compose([T.RandomRotation(30),
                        T.Resize(30),
                        T.ToTensor(),
                        T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

Arabic_train_data = datasets.ImageFolder(Arabic_train_dataPath, transform = transforms)
Arabic_test_data = datasets.ImageFolder(Arabic_test_dataPath, transform = transforms)

Here is the code I used. The data path has 10 folders of Arabic handwritten numbers, and each folder contains 6000 BMP image files.

Every time I run

print(len(Arabic_train_data))

it outputs the wrong dataset length, and the number increases every time I run it.

The test data length, however, is 10000, as it should be.

So if you run these commands sequentially, you’ll get different results?

Arabic_train_data = datasets.ImageFolder(Arabic_train_dataPath, transform = transforms)
print(len(Arabic_train_data))
print(len(Arabic_train_data))
print(len(Arabic_train_data))
print(len(Arabic_train_data))

Not when I run them back to back, no. But after a delay of about 2 minutes the value increases, as if ImageFolder keeps loading data over time.

Another weird thing: the length has now exceeded the number of images in the dataset. It reached 56k, and it is supposed to be 54k.

Are you working on a shared drive or are you moving data around?
This seems really weird, as the samples and targets are created in the __init__ call.
So after initializing the ImageFolder, the length should be constant even if you add new files to the folders.
Or are you reinitializing the dataset also?
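
To check whether the folder contents themselves are changing between runs, it might help to count the image files on disk directly, independent of ImageFolder. The following is only a minimal sketch: the helper name count_images_per_class and the extension list are my own assumptions, and it assumes the usual root/<class>/<file> layout.

```python
import os
from collections import Counter

# Assumed set of extensions to count; ".bmp" matches the files described in this thread.
IMAGE_EXTENSIONS = (".bmp", ".png", ".jpg", ".jpeg")


def count_images_per_class(root):
    """Hypothetical helper: return {class_folder: image_file_count} for root/<class>/<file>."""
    counts = Counter()
    for class_entry in os.scandir(root):
        # Only subfolders of root are treated as classes; loose files in root are ignored.
        if not class_entry.is_dir():
            continue
        # Walk recursively so files in nested subfolders are counted as well.
        for _dirpath, _dirnames, filenames in os.walk(class_entry.path):
            counts[class_entry.name] += sum(
                name.lower().endswith(IMAGE_EXTENSIONS) for name in filenames
            )
    return dict(counts)
```

Calling count_images_per_class(Arabic_train_dataPath) a few minutes apart and comparing sum(counts.values()) with len(Arabic_train_data) should show whether the number of files visible on the drive is what changes, or whether the dataset is being recreated.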

I am working on Google Colab and connecting it to my Google Drive account.

Still, the length shouldn’t change once the dataset is initialized, so are you reinitializing it somewhere in your code?