How does next(iter(...)) work on a torchdata IterDataPipe?


I have been using the new torchdata datapipes recently. When I moved from a small subset of the training data to the entire dataset, I noticed that each batch was taking considerably longer to load.

I did some testing and realised that each time I call next(iter(datapipe)), it iterates through the entire dataset before outputting one batch. This is extremely slow, because the pipeline streams data and performs heavy operations on each item in the dataset.

I am therefore not sure how next(iter(...)) works on IterDataPipes and would like some guidance on proper use.
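
To make the calling pattern concrete, here is a minimal sketch in plain Python, not torchdata itself. `CountingSource` and `BatchedPipe` are hypothetical stand-ins for a streaming source and a batching stage. The sketch only shows generic Python iterator behaviour: every `iter()` call starts a new pass, so repeating `next(iter(pipe))` keeps returning the first batch, while holding on to one iterator advances through the data. It does not reproduce the full pass over the dataset described above, and a real datapipe may behave differently.

```python
from typing import Iterator, List


class CountingSource:
    """Hypothetical stand-in for a streaming source; counts the items it reads."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.items_read = 0

    def __iter__(self) -> Iterator[int]:
        # A new generator is created on every iter() call, so the stream restarts.
        for i in range(self.n):
            self.items_read += 1
            yield i


class BatchedPipe:
    """Hypothetical stand-in for a batching stage on top of a source."""

    def __init__(self, source: CountingSource, batch_size: int) -> None:
        self.source = source
        self.batch_size = batch_size

    def __iter__(self) -> Iterator[List[int]]:
        batch: List[int] = []
        for item in self.source:
            batch.append(item)
            if len(batch) == self.batch_size:
                yield batch
                batch = []
        if batch:
            # Emit the final, possibly smaller, batch.
            yield batch


# Pattern 1: a new iterator per call. Each call restarts from the first item.
source_a = CountingSource(10)
pipe_a = BatchedPipe(source_a, batch_size=2)
restarted = [next(iter(pipe_a)) for _ in range(3)]

# Pattern 2: create the iterator once and keep calling next() on it.
source_b = CountingSource(10)
pipe_b = BatchedPipe(source_b, batch_size=2)
it = iter(pipe_b)
advanced = [next(it) for _ in range(3)]

print(restarted)  # the first batch, three times
print(advanced)   # three consecutive batches
```

In this sketch both patterns read the same number of source items, but only the second makes progress through the data. If a stage in a real pipeline has to consume a lot of upstream data before it can emit anything, restarting it on every call would repeat that cost each time.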