When I use torchvision.datasets.ImageFolder on a large dataset (~120 000 images) with two folders (fake and real), the dataset PyTorch ends up using has exactly 68 850 samples, which equals the number of images in the real folder alone.
From the CSV file I write out after training, I found that exactly 30 000 of those samples are fake images and 38 850 are real. So ImageFolder is only using a subset of the actual training dataset. Has anyone had a similar experience, or any advice on how to debug this? I’m using