Greetings.
I have been experimenting with the Oxford Flowers102 dataset using a diffusion training script.
Training was shorter than with the code base I used as a reference, and after checking the dataset length, it turned out that in torchvision (v3.6.0), the split is as follows:
Train: 1020 samples
Val: 1020 samples
Test: 6149 samples
while in Hugging Face’s dataset, it is:
Train: 7169 samples
Test: 1020 samples
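For what it's worth, the two layouts do cover the same total number of images, and the Hugging Face train split is exactly the size of torchvision's val and test splits combined. A quick sanity check on the counts above (the pairing itself is my assumption, not something either library documents):

```python
# Counts as reported by the two libraries (quoted from the post above).
tv_train, tv_val, tv_test = 1020, 1020, 6149  # torchvision Flowers102
hf_train, hf_test = 7169, 1020                # Hugging Face dataset

# Both layouts cover the same 8189 images in total.
assert tv_train + tv_val + tv_test == hf_train + hf_test == 8189

# The Hugging Face "train" split matches torchvision's val + test
# combined (1020 + 6149 = 7169), suggesting the splits were remapped
# rather than resampled (assumption, not verified against image IDs).
assert tv_val + tv_test == hf_train
print("totals match: 8189 images")
```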
I would have assumed the train split would take most of the samples, so I was wondering if there is a specific reason for this split?
Thank you for your time.