Accessing ChainDataset

Hi,

I understand that data in a ConcatDataset can be accessed (say, for training) as follows:
for i, (x, y) in enumerate(DataLoader(train, batch_size, shuffle=True, num_workers=5)):
    pass

I cannot work out how to take a similar approach with IterableDatasets joined by ChainDataset. I have an instance of ChainDataset, but I am not sure how to iterate over it to access the individual elements.

Also, I get the following error:
iter() returned non-iterator of type 'tuple'
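
For reference, Python raises this TypeError when an object's __iter__ method returns a tuple instead of an iterator. The sketch below reproduces it with plain Python classes; BrokenStream and FixedStream are hypothetical names, and it is only an assumption that the __iter__ of the IterableDataset subclass in question has the same problem.

```python
class BrokenStream:
    """Hypothetical class whose __iter__ returns a tuple, not an iterator."""

    def __init__(self, xs, ys):
        self.xs, self.ys = xs, ys

    def __iter__(self):
        return self.xs, self.ys  # a tuple: iter() rejects this


class FixedStream(BrokenStream):
    """Same data, but __iter__ returns a real iterator."""

    def __iter__(self):
        # zip() returns an iterator that yields (x, y) pairs
        return zip(self.xs, self.ys)


try:
    iter(BrokenStream([1, 2], [10, 20]))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

pairs = list(FixedStream([1, 2], [10, 20]))
print(error_message)
print(pairs)
```

If that is the cause, returning zip(...) or iter(...) from __iter__, or writing it as a generator with yield, should make the error go away.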

Thanks!

For an IterableDataset, you don't need a DataLoader to iterate over it, because the dataset is itself iterable. See an example here

@zhangguanheng66

Thank you for your reply.

Here is the challenge. Say I have 10 PyTorch training tensor files (each ~3 GB). I cannot load them all into CUDA memory at once. So I was thinking of creating an IterableDataset for each of the 10 training files and joining them with ChainDataset. The question is how to iterate over a ChainDataset object.
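
One way this setup could look is sketched below. TensorFileDataset is a hypothetical class, and the file layout (each file holding an (x, y) tuple of tensors written with torch.save) is an assumption; three tiny temporary files stand in for the real 3 GB ones so the snippet runs on its own.

```python
import os
import tempfile

import torch
from torch.utils.data import ChainDataset, DataLoader, IterableDataset


class TensorFileDataset(IterableDataset):
    """Hypothetical dataset: loads one saved (x, y) file only when iterated."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Assumed layout: a tuple (x, y) of tensors saved with torch.save
        x, y = torch.load(self.path, map_location="cpu")
        return zip(x, y)  # an iterator over (sample, label) pairs


# Stand-in for the real files: three tiny files in a temporary directory
tmp_dir = tempfile.mkdtemp()
paths = []
for k in range(3):
    path = os.path.join(tmp_dir, f"train_{k}.pt")
    torch.save((torch.randn(4, 2), torch.full((4,), k)), path)
    paths.append(path)

train = ChainDataset([TensorFileDataset(p) for p in paths])

# shuffle is omitted: DataLoader does not accept shuffle=True
# for iterable-style datasets
seen_labels = []
for i, (x, y) in enumerate(DataLoader(train, batch_size=4)):
    # x and y are CPU batches here; move them to the GPU at this point
    seen_labels.extend(y.tolist())

print(seen_labels)
```

Each file is read only when the chain reaches it. Iterating over train directly (for x, y in train) yields single samples instead of batches. With num_workers > 0, every worker gets its own copy of an iterable-style dataset, so samples would be duplicated unless the data is split per worker.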

Thanks!

It's partly a matter of personal choice. One thing I would investigate is itertools.chain.from_iterable([dataset1, dataset2]).
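
A minimal sketch of that suggestion, using plain generators as hypothetical stand-ins for dataset1 and dataset2 (any iterable, including an IterableDataset, can take their place):

```python
from itertools import chain


def stream(name, n):
    """Hypothetical stand-in for one iterable dataset."""
    for i in range(n):
        yield (name, i)


dataset1 = stream("a", 2)
dataset2 = stream("b", 3)

# chain.from_iterable is lazy: dataset2 is not touched
# until dataset1 has been exhausted
samples = list(chain.from_iterable([dataset1, dataset2]))
print(samples)
```

This gives one flat stream of samples but no batching, so batches would have to be assembled by hand.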