I have a list of indices into train_loader.dataset for the elements I want to keep. I want to select only these elements of the Dataset, then use the restricted Dataset in the DataLoader. Is there any way to do this?
Thanks for your reply. I’ve created a Subset object using the relevant indices and the original dataset. However, I’m not sure how to use this in my original DataLoader. Do I simply overwrite the original dataset?
i.e. train_loader.dataset = mysubset
or is there something more complicated that I must do?
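For what it's worth, here is a minimal sketch of one way to do this: build a new DataLoader around the Subset instead of overwriting the dataset of the existing one. As far as I know, recent PyTorch versions refuse to reassign dataset on a DataLoader that is already initialized. The toy TensorDataset, keep_indices and subset_loader are placeholder names assumed for illustration; only train_loader and mysubset come from the question.

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

# Toy stand-in for the real training data (assumption: 10 samples, label == index).
full_dataset = TensorDataset(torch.arange(10).float().unsqueeze(1), torch.arange(10))
train_loader = DataLoader(full_dataset, batch_size=4, shuffle=True)

# Hypothetical list of indices to keep.
keep_indices = [0, 2, 5, 7, 9]

# Subset wraps the original dataset and exposes only the chosen indices.
mysubset = Subset(train_loader.dataset, keep_indices)

# Create a fresh loader for the subset, reusing settings from the original one.
subset_loader = DataLoader(mysubset, batch_size=train_loader.batch_size, shuffle=True)

for inputs, labels in subset_loader:
    print(inputs.shape, labels)
```

Any other arguments passed to the original loader (num_workers, collate_fn, and so on) would need to be passed again to the new one.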
I am confused about Subset() in torch.utils.data. I have a list of indices and a PyTorch dataset (e.g. CIFAR). When I use the indices to get a subset of the dataset, subset.dataset still has the same length as the original dataset, even though the length is correct once the subset is loaded into a DataLoader.
I would like to know how to check the length of a subset and how to iterate over it.
The underlying .dataset will not be changed and will keep its original size.
You can check the length of the Subset via len(subset) and iterate it with for data in subset.
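A small sketch of the difference between the two lengths, using a toy TensorDataset in place of CIFAR (the data and the index list are made up for illustration):

```python
import torch
from torch.utils.data import Subset, TensorDataset

# Toy dataset with 10 samples; the label of each sample equals its index.
dataset = TensorDataset(torch.arange(10).float().unsqueeze(1), torch.arange(10))

indices = [1, 3, 4]  # hypothetical indices to keep
subset = Subset(dataset, indices)

print(len(subset))          # 3  -> size of the subset itself
print(len(subset.dataset))  # 10 -> the wrapped original dataset is untouched

# Iterating the subset yields only the selected samples, in the order of `indices`.
labels = []
for x, y in subset:
    labels.append(int(y))
print(labels)  # [1, 3, 4]
```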
Hi @ptrblck, I have a dataset defined, and I want to define a sampler that draws batches of size n, where the n indices of each batch are given by the user. How can I achieve this?
I don’t fully understand your use case. Do you want the user to pass the batch indices in each iteration such that no sampling is used anymore or should these samples be somehow predefined by the user?
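If the batches are predefined by the user, one possible sketch is to pass the lists of indices through the batch_sampler argument of DataLoader, which accepts any iterable that yields one list of indices per batch. The toy dataset and the user_batches list below are assumptions for illustration, not something taken from the question.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset with 10 samples; the label of each sample equals its index.
dataset = TensorDataset(torch.arange(10).float().unsqueeze(1), torch.arange(10))

# Hypothetical user-provided batches: each inner list holds the n indices of one batch.
user_batches = [[0, 3, 8], [1, 2, 9]]

# With batch_sampler set, batch_size, shuffle, sampler and drop_last must keep their defaults.
loader = DataLoader(dataset, batch_sampler=user_batches)

batches = [labels.tolist() for _, labels in loader]
print(batches)  # [[0, 3, 8], [1, 2, 9]]
```

If the indices are only known at each iteration, indexing the dataset directly without a DataLoader sampler may be the simpler route.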