Using an index array to split a dataset into train, validation, etc.

Hi, I am trying to split my dataset into train, validation, and test sets.
I’ve already used train_test_split to get the indices for my ImageFolder dataset object. How can I retrieve the datasets directly instead of using a function like Subsampler?
I’d really appreciate it if anyone could give me a hint.

I don’t know if it would fit your use case of “directly” splitting the dataset, but torch.utils.data.Subset accepts the dataset as well as the corresponding indices.
The reason to use this wrapper is that it keeps data loading lazy: the actual samples are loaded and processed in __getitem__ when they are indexed, rather than preloaded in __init__, which saves memory and init time.
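
Here is a rough sketch of how that could look. I don’t have your data, so a small TensorDataset stands in for your ImageFolder object, and the index lists (train_idx, val_idx, test_idx) are made-up placeholders for whatever train_test_split gave you:

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

# Stand-in for the ImageFolder dataset (assumption): 10 fake samples with labels.
features = torch.arange(10).float().unsqueeze(1)
labels = torch.arange(10) % 2
full_dataset = TensorDataset(features, labels)

# Hypothetical index arrays, standing in for the ones from train_test_split.
train_idx = [0, 1, 2, 3, 4, 5]
val_idx = [6, 7]
test_idx = [8, 9]

# Subset only stores the dataset and the indices; samples are fetched
# from the wrapped dataset's __getitem__ when they are indexed.
train_set = Subset(full_dataset, train_idx)
val_set = Subset(full_dataset, val_idx)
test_set = Subset(full_dataset, test_idx)

# Each Subset is itself a Dataset, so it can be passed to a DataLoader.
train_loader = DataLoader(train_set, batch_size=2, shuffle=True)
```

Indexing a Subset maps back to the original dataset, so val_set[0] returns the same sample as full_dataset[6] here.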


Thanks, it helped me solve the problem!