Subset Dataloader

Hello, I’d like to ask about the data loader.
My dataset is very large, so I’d like to run experiments on a subset in order to quickly compare my model against multiple baselines. Would it be possible to do this simply by reducing the length reported by the dataset (i.e., __len__: return {original_data_length} // 10), so that the identical subset is used for every model I train?

Alternatively, I would appreciate any tips for quickly comparing models on huge datasets so that I don’t waste time.

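Yes, you can reduce the length returned by the Dataset’s __len__ method. The DataLoader’s sampler would then only generate index values in [0, len(dataset)-1] and pass them to the __getitem__ method, so as long as __getitem__ maps each index to the same sample, every model is trained on the same leading portion of the data.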
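As a rough illustration of the idea above, here is a minimal sketch in plain Python. The wrapper name TruncatedDataset and its divisor parameter are hypothetical, not part of PyTorch, and a simple list stands in for the real dataset. It only relies on the map-style protocol (__len__ and __getitem__).

```python
class TruncatedDataset:
    """Hypothetical wrapper: exposes only the first len(base) // divisor samples."""

    def __init__(self, base, divisor=10):
        self.base = base
        # The reduced length that samplers would see via len(dataset).
        self.length = len(base) // divisor

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        # Reject indices outside [0, len(self) - 1] so the tail of the
        # base dataset can never be reached through this wrapper.
        if not 0 <= index < self.length:
            raise IndexError(index)
        return self.base[index]


# A plain list stands in for the full dataset (assumption for illustration).
full = list(range(1000))
subset = TruncatedDataset(full, divisor=10)
print(len(subset))  # 100
```

An object like this can be passed to torch.utils.data.DataLoader in place of the full dataset; with shuffle=True the order changes between epochs, but only these first samples are ever drawn. If a random rather than leading subset is preferred, torch.utils.data.Subset(dataset, indices) with a fixed, saved list of indices is another option that keeps the subset identical across models.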
