Hi there,
The documentation for ConcatDataset & ChainDataset isn’t exactly clear (at least to me).
If I wanted to have a dataloader that creates batches of:
- A matrix
- A vector
- A label
Does ConcatDataset allow me to iterate over the matrices and the vectors from 2 different datasets simultaneously? And will shuffling still return the corresponding matrix and vector together?
Many Thanks
ChainDataset is used for IterableDatasets, while ConcatDataset is used for map-style datasets.
No, as ConcatDataset will concatenate the passed datasets end to end and won’t yield their samples simultaneously.
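To make the concatenation behaviour concrete, here is a minimal sketch. The tensor shapes and variable names are made up for illustration and are not from the original question. It shows that the combined dataset is simply the first dataset followed by the second, so each index returns a sample from one dataset only:

```python
import torch
from torch.utils.data import ConcatDataset, TensorDataset

# Hypothetical data: 10 "matrices" of shape (3, 3) and 5 "vectors" of shape (3,)
matrices = TensorDataset(torch.zeros(10, 3, 3))
vectors = TensorDataset(torch.ones(5, 3))

combined = ConcatDataset([matrices, vectors])

# The length is the sum of both datasets
n_total = len(combined)

# Indices 0..9 come from `matrices`, indices 10..14 come from `vectors`
first_shape = combined[0][0].shape
last_shape = combined[14][0].shape
print(n_total, first_shape, last_shape)
```

So a single index never gives you a matrix and a vector at the same time.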
You could zip the DataLoaders and iterate them together:
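import torch
from torch.utils.data import DataLoader, TensorDataset

# Two map-style datasets of equal length
dataset1 = TensorDataset(torch.zeros(10, 1), torch.zeros(10, 1))
dataset2 = TensorDataset(torch.ones(10, 1), torch.ones(10, 1))
loader1 = DataLoader(dataset1, num_workers=2, batch_size=2)
loader2 = DataLoader(dataset2, num_workers=2, batch_size=2)
# zip yields one batch from each loader per step and stops at the shorter loader
for (x1, y1), (x2, y2) in zip(loader1, loader2):
    print(x1, y1)
    print(x2, y2)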
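One caveat on the shuffle part of the question: if both loaders are created with shuffle=True, each one shuffles independently, so the zipped batches would no longer line up sample by sample. If the matrix, vector and label for a sample all share the same index, a possible alternative is to put them into a single TensorDataset and use one DataLoader. This is only a sketch under that assumption, with made-up shapes and names:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical data: sample i has a matrix and a vector filled with the value i,
# so it is easy to check that they stay together after shuffling.
labels = torch.arange(10)
matrices = labels.float().view(10, 1, 1).repeat(1, 4, 4)  # shape (10, 4, 4)
vectors = labels.float().view(10, 1).repeat(1, 4)         # shape (10, 4)

# One dataset whose items are (matrix, vector, label) tuples
paired = TensorDataset(matrices, vectors, labels)
loader = DataLoader(paired, batch_size=2, shuffle=True)

aligned = True
for m, v, y in loader:
    # Every matrix and vector in the batch should match its own label
    aligned = aligned and torch.equal(m[:, 0, 0], y.float())
    aligned = aligned and torch.equal(v[:, 0], y.float())
print(aligned)
```

Because the shuffle happens over indices of the single dataset, each batch keeps the matrix, vector and label of a sample together.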
Hi Piotr,
Many thanks for the thorough reply; I have zipped the 2 loaders together. Thanks also for your continual replies all over the forum, they’re a great help to a huge number of people.