Issues with torch.utils.data.random_split

torch==1.3.1
torchvision==0.4.2

Hi @ptrblck, the issue is resolved. It is working for me also. I was finding the length using:

print(len(train_set.dataset)) 

which gives the length of the parent dataset. I want to convert the Subset object into a Dataset object.
Is there a way to do this conversion?

Subset wraps the Dataset in order to apply the specified indices and yield a subset of the samples.
What is your use case for reverting it?
You can pass the Subset directly to a DataLoader, if that’s the issue.
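
A minimal sketch of the difference, using a TensorDataset as a stand-in for the real dataset:

import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

# stand-in dataset with 100 samples
dataset = TensorDataset(torch.randn(100, 3), torch.randint(0, 10, (100,)))
train_set, val_set = random_split(dataset, [80, 20])

print(len(train_set))          # 80  -> length of the Subset
print(len(train_set.dataset))  # 100 -> length of the parent dataset

# a Subset can be passed to a DataLoader directly
loader = DataLoader(train_set, batch_size=16, shuffle=True)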

I want to fetch the entire dataset as x and y. The Dataset class has .data and .targets attributes, which serve this purpose.
A DataLoader with the batch size set to len(dataset) will be used.

The DataLoader will use the length of the Subset, not of the underlying Dataset.
If you want to fetch the underlying data to process it, or for some other use case, I would recommend doing so before splitting.
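
A minimal sketch of that order of operations, assuming an MNIST-style dataset that exposes .data and .targets:

from torch.utils.data import random_split

# fetch the raw tensors from the parent dataset *before* splitting
# note: indexing .data directly bypasses any transforms on the dataset
x, y = dataset.data, dataset.targets

train_set, val_set = random_split(dataset, [50_000, 10_000])

# if needed afterwards, each Subset still stores the selected indices
train_x = dataset.data[train_set.indices]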

Could you explain your use case a bit more, so that I understand why you need to access the internal data after splitting?

@ptrblck
My use case is to first divide the dataset into two different subsets. Each subset should then have a __getitem__ that returns a pair of samples belonging to the same class, so that a batch of 4 means a total of 8 samples, paired by class.

Example: from the MNIST dataset, a batch would be (1, 1), (2, 2), (7, 7), and (9, 9).

Your post on torch.utils.data.dataset.random_split resolves the issue of dividing the dataset into two subsets and using __getitem__ on the individual subsets. But can you help with a workaround for using the index in __getitem__ to return pairs from the same class?

Thanks.
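
One possible implementation of such paired sampling (a sketch, not from the thread; the PairedDataset name and the random choice of partner are illustrative assumptions):

import random
from collections import defaultdict
from torch.utils.data import Dataset

class PairedDataset(Dataset):
    def __init__(self, subset):
        self.subset = subset
        # map each class label to the positions in the subset holding that class
        self.positions_per_label = defaultdict(list)
        for pos in range(len(subset)):
            _, label = subset[pos]
            self.positions_per_label[int(label)].append(pos)

    def __getitem__(self, index):
        x1, label = self.subset[index]
        # draw a random partner from the same class, so a batch of 4
        # indices yields 4 pairs, i.e. 8 samples in total
        partner = random.choice(self.positions_per_label[int(label)])
        x2, _ = self.subset[partner]
        return (x1, x2), label

    def __len__(self):
        return len(self.subset)

Note that the __init__ scan loads every sample once to read its label; reading labels directly from the underlying dataset would be faster, but is dataset-specific.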

@ptrblck I had the same issue as Ashima. It seems I was checking len(dataloader.dataset). However, the dimensions still don’t look right. I am trying to split 200k rows into 160k for training and 40k for validation.

I am not sure why I see 40k and 10k.

Hi, I would like to split my dataset into a training and a validation part, where the indices of each subset should be in the range 0 to len(train_data) or 0 to len(validation_data), respectively. Is there a method I can use for this?

If you wrap your Dataset into a Subset, you can pass the training and validation indices to it.
Each Subset will then accept indices in the range [0, len(subset) - 1].

The indices passed to create the Subsets should of course not overlap.
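
A sketch of one way to build such non-overlapping Subsets, assuming an 80/20 split:

import torch
from torch.utils.data import Subset

# shuffle all indices once, then carve out disjoint ranges
indices = torch.randperm(len(dataset)).tolist()
split = int(0.8 * len(dataset))
train_set = Subset(dataset, indices[:split])
val_set = Subset(dataset, indices[split:])

# each Subset is indexed from 0 to len(subset) - 1
first_val_sample = val_set[0]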

I have a dataset folder (with .jpg and .xml files) for object detection, and I want to split it into training and validation sets with respect to file names, like we would do by creating separate train and test folders. I also tried torch.utils.data.random_split, but it splits by XML object index, not by image. Is there a method to do this?

I assume each image file has a corresponding xml file with the annotations for the object detection task?
If so, I would recommend creating a custom dataset and building the mapping between the image paths and xml paths inside the Dataset.__init__ method.
Once you have this mapping, you can load each image and its annotation in the Dataset.__getitem__ method.

To split the dataset into a training, validation, and test set, you can create all indices and shuffle them via torch.randperm(len(dataset)), or alternatively use e.g. sklearn.model_selection.train_test_split, which has a few more options.
These indices can then be passed to a Subset, which wraps the dataset to create the splits.
Alternatively, you can use a SubsetRandomSampler and pass one sampler each to the DataLoader while creating the training, validation, and test dataloaders.
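
A rough sketch of such a dataset, assuming every image sits next to an xml file with the same stem; parse_annotation is a hypothetical helper for your annotation format:

import glob
import os
from PIL import Image
from torch.utils.data import Dataset

class DetectionDataset(Dataset):
    def __init__(self, root):
        # build the mapping between image paths and xml paths once
        self.image_paths = sorted(glob.glob(os.path.join(root, "*.jpg")))
        self.xml_paths = [p[:-4] + ".xml" for p in self.image_paths]

    def __getitem__(self, index):
        image = Image.open(self.image_paths[index]).convert("RGB")
        target = parse_annotation(self.xml_paths[index])  # hypothetical xml parser
        return image, target

    def __len__(self):
        return len(self.image_paths)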

Let me know, if that would work for you.

Could you please help me with the sample code?

Could you post one or two dummy inputs with their right format, please?

Sure. Right now I am using this:

trainset = core.Dataset(dataset_path)
train_len = int(len(trainset) * 0.8)
test_len = len(trainset) - train_len
train_set = torch.utils.data.Subset(trainset, range(0, train_len))
val_set = torch.utils.data.Subset(trainset, range(train_len, len(trainset)))

But I want to shuffle the train and val sets with respect to file names, because if I shuffle by raw indices, some indices belonging to a common file might end up in both train and test.

For example: the file ‘cat1.jpg’ has 3 cats at indices 0, 1, and 2 in ‘cat1.xml’. I don’t want indices 0 and 1 in train and index 2 in test or validation; I want all three indices of the same file in either the train set or the test set.

I assume you’ve already created the dataset and are able to load each sample?
If so, you could use sklearn.model_selection.GroupShuffleSplit, which takes an additional groups argument in its split method to create the training and test indices.
For the groups you could pass the file name of each sample.
Once you have the indices, you can pass them to Subsets.
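
A sketch of that idea, assuming the dataset exposes one source file name per sample (the dataset.filenames attribute below is an assumption):

import numpy as np
from sklearn.model_selection import GroupShuffleSplit
from torch.utils.data import Subset

# one group label per sample: the file the sample came from (assumed attribute)
filenames = dataset.filenames

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, val_idx = next(splitter.split(np.arange(len(filenames)), groups=filenames))

train_set = Subset(dataset, train_idx.tolist())
val_set = Subset(dataset, val_idx.tolist())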

It worked. Thanks a lot :blush:

I’ve created a script which is given here: How to split dataset into test and validation sets

Does it split each class in an 80:20 ratio, or just randomly split the whole dataset 80:20?

train_dataset, test_dataset = torch.utils.data.random_split(ants_dataset, (train_length, test_length))

It splits the data randomly. If you want to apply a stratified split, you could use sklearn.model_selection.train_test_split with the stratify argument to create the training and validation indices, which can then be used in a Subset or a SubsetRandomSampler.
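
A sketch of a stratified split, assuming the class labels are available as a list aligned with the dataset (the labels name is an assumption; for MNIST-style datasets it could come from dataset.targets):

from sklearn.model_selection import train_test_split
from torch.utils.data import Subset

indices = list(range(len(dataset)))
train_idx, val_idx = train_test_split(
    indices,
    test_size=0.2,
    stratify=labels,  # keeps the class ratio in both splits
)
train_set = Subset(dataset, train_idx)
val_set = Subset(dataset, val_idx)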

Should I use this? I want to split the data by file names as well.

from sklearn.model_selection import GroupShuffleSplit

train_index, test_index = next(
    GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=15)
    .split(trainset_df, groups=trainset_df.filename)
)

train_set = torch.utils.data.Subset(trainset, train_index)
val_set = torch.utils.data.Subset(trainset, test_index)
