I'm a bit new to using iterable datasets. I have been using torchdata to build my dataloaders, but that functionality seems to be deprecated and is now even being deleted. So I want to build an IterableDataset instead, but it seems DistributedSampler cannot be used with an iterable dataset. Any suggestions on how to use DDP with iterable datasets?
I'm aware of this large issue: ChunkDataset API proposal by thiagocrepaldi · Pull Request #26547 · pytorch/pytorch · GitHub
But I don't think this functionality has been added yet.
And also this post: Using IterableDataset with DistributedDataParallel
There I couldn't find any workable process or examples.
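To make the question concrete, here is a rough sketch of the manual sharding that seems to be needed in place of DistributedSampler. It is only an illustration under assumptions, not a confirmed recipe: `shard_stream` is a hypothetical helper, and the idea is that an IterableDataset's `__iter__` would call it with values from `torch.distributed.get_rank()`, `torch.distributed.get_world_size()`, and `torch.utils.data.get_worker_info()`. The sketch itself is plain Python so the slicing logic can be checked without a GPU setup.

```python
from itertools import islice


def shard_stream(stream, rank, world_size, worker_id=0, num_workers=1):
    """Return an iterator over the items belonging to one (rank, worker) pair.

    Hypothetical helper: rank/world_size would come from torch.distributed,
    worker_id/num_workers from torch.utils.data.get_worker_info().
    """
    # Each DDP process may run several DataLoader workers, so the number of
    # independent consumers of the stream is world_size * num_workers.
    total_consumers = world_size * num_workers
    consumer_index = rank * num_workers + worker_id
    # Take every total_consumers-th item, starting at this consumer's offset.
    return islice(stream, consumer_index, None, total_consumers)


# Simulate 2 ranks with 2 DataLoader workers each reading 10 samples.
shards = {
    (rank, worker): list(shard_stream(range(10), rank, 2, worker, 2))
    for rank in range(2)
    for worker in range(2)
}

for key, items in shards.items():
    print(key, items)
```

In this simulation the shards do not all have the same length. Ranks that see different numbers of batches can stall DDP's gradient synchronization, so a real stream would probably need to be truncated or padded to an equal length per rank.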