Load Data and Train simultaneously on two datasets

Hi ptrblck, I am quite confused about how we can load the datasets simultaneously if the two datasets are different.

I'm not sure what "different" would mean in this context. In the previously described use case, I suggested loading the image pairs in a single custom Dataset instead of creating multiple Datasets and loaders, which seems to have worked fine. Are you seeing any issues using this approach?
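As a minimal sketch of the single custom Dataset approach: the class name PairedDataset and the toy tensors below are hypothetical stand-ins for the image pairs, not code from the earlier posts. Each index returns one sample from each source, so a single DataLoader shuffles both together and the pairs stay aligned.

```python
import torch
from torch.utils.data import DataLoader, Dataset, TensorDataset


class PairedDataset(Dataset):
    """Hypothetical wrapper: returns the samples at the same index from two datasets."""

    def __init__(self, dataset_a, dataset_b):
        if len(dataset_a) != len(dataset_b):
            raise ValueError("both datasets must have the same length")
        self.dataset_a = dataset_a
        self.dataset_b = dataset_b

    def __len__(self):
        return len(self.dataset_a)

    def __getitem__(self, index):
        # One index selects the matching sample from each dataset.
        return self.dataset_a[index], self.dataset_b[index]


# Toy stand-in data (assumption): sample i of dataset_b equals sample i of dataset_a + 100.
dataset_a = TensorDataset(torch.arange(10))
dataset_b = TensorDataset(torch.arange(10) + 100)

paired = PairedDataset(dataset_a, dataset_b)
loader = DataLoader(paired, batch_size=4, shuffle=True)

for (batch_a,), (batch_b,) in loader:
    # The order is random, but each element of batch_b still belongs to
    # the element of batch_a at the same position.
    print(batch_a.tolist(), batch_b.tolist())
```

If the two sources have different lengths, this wrapper would need a rule for pairing them (for example, truncating to the shorter one), which depends on the use case.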

Even if you create your own indices, doesn't SubsetRandomSampler still randomly permute the indices, leading to a mismatch? (See below)

def __iter__(self) -> Iterator[int]:
    for i in torch.randperm(len(self.indices), generator=self.generator):
        yield self.indices[i]

No. Since yield self.indices[i] is used, you are randomly indexing the passed self.indices, which contains your designed subset. The order is shuffled, but only indices from your subset are ever returned.
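A quick check of this behavior, as a sketch with made-up index values: the sampler returns the passed indices in a random order, and nothing outside that list.

```python
import torch
from torch.utils.data import SubsetRandomSampler

# Hypothetical hand-picked subset of a larger dataset.
indices = [3, 7, 11, 42]

# The seeded generator is only there to make the run repeatable.
generator = torch.Generator().manual_seed(0)
sampler = SubsetRandomSampler(indices, generator=generator)

drawn = [int(i) for i in sampler]

# The order differs from run to run (or seed to seed),
# but the drawn values are exactly the ones in `indices`.
print(drawn)
print(sorted(drawn) == sorted(indices))
```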

I think the point I was trying to make was that if you use SubsetRandomSampler on two datasets that you want shuffled randomly in the same way, this will not work.

This sounds contradictory, as you are either shuffling randomly or in a defined way. If you don't want to randomly shuffle the datasets, you could use a plain Subset or write a custom Dataset and index the samples in your way.
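One way to read the plain Subset suggestion, sketched under the assumption that both datasets have the same length and that a single permutation drawn up front is acceptable: generate the order once, pass it to a Subset of each dataset, and disable shuffling in the loaders. The toy tensors are stand-ins.

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

# Toy stand-in data (assumption): sample i of dataset_b equals sample i of dataset_a + 100.
dataset_a = TensorDataset(torch.arange(10))
dataset_b = TensorDataset(torch.arange(10) + 100)

# Draw one random order and reuse it for both datasets.
generator = torch.Generator().manual_seed(0)
order = torch.randperm(len(dataset_a), generator=generator).tolist()

subset_a = Subset(dataset_a, order)
subset_b = Subset(dataset_b, order)

# shuffle=False keeps the order defined by `order` in both loaders.
loader_a = DataLoader(subset_a, batch_size=4, shuffle=False)
loader_b = DataLoader(subset_b, batch_size=4, shuffle=False)

for (batch_a,), (batch_b,) in zip(loader_a, loader_b):
    print(batch_a.tolist(), batch_b.tolist())
```

The order stays fixed until a new permutation is drawn, so for a different shuffle in each epoch the Subsets would have to be rebuilt per epoch.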