I’ve implemented a custom dataset class for my purpose by subclassing Dataset, and it works properly. I’d like to take a very small subset of the dataset, say 50 samples, to check that my model can overfit it. However, the data consists of many h5 and json files, so changing this inside my dataset class seems very hard, if not infeasible.
I tried restricting the training loop by indexing, but that was not possible because neither the Dataset object nor the enumerate object supports indexing. I can provide additional info or code if requested. The way I use the DataLoader is:
for idx, batch in enumerate(dataloader_train):
    ...
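import numpy as np
from torch.utils.data import DataLoader

# Indices of the samples to draw from the dataset (here, the first 20, in random order).
picks = np.random.permutation(20).tolist()
dataloader_train = DataLoader(
    dataset,
    batch_size=batch_size,
    shuffle=False,  # note that the sampler and shuffle arguments are mutually exclusive
    sampler=picks,  # any iterable of indices that has a length is accepted here
    collate_fn=dataset.collate_fn,
)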
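Another option that may be worth trying is torch.utils.data.Subset, which wraps an existing dataset and exposes only the indices you give it, so the dataset class itself stays untouched. The sketch below is only an illustration: ToyDataset is a hypothetical stand-in for the custom dataset in the question, and the subset size of 50, the batch size of 10, and the fixed seed are assumptions made for the example.

```python
import torch
from torch.utils.data import DataLoader, Dataset, Subset


class ToyDataset(Dataset):
    """Hypothetical stand-in for the custom dataset in the question."""

    def __init__(self, n=1000):
        # One feature per sample; the value equals the sample's index.
        self.data = torch.arange(n, dtype=torch.float32).unsqueeze(1)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, i):
        return self.data[i]


dataset = ToyDataset()

# Fix the seed so the same 50 samples are picked on every run.
generator = torch.Generator().manual_seed(0)
picks = torch.randperm(len(dataset), generator=generator)[:50].tolist()

# Subset only exposes the chosen indices of the wrapped dataset.
small_dataset = Subset(dataset, picks)

# shuffle=True now reorders only the 50 chosen samples each epoch.
small_loader = DataLoader(small_dataset, batch_size=10, shuffle=True)

seen = []
for idx, batch in enumerate(small_loader):
    # Each batch has shape (10, 1); flatten it to record which samples appeared.
    seen.extend(batch.squeeze(1).tolist())
```

With a custom collate function, it should still be passed from the original object (collate_fn=dataset.collate_fn), because Subset does not forward attributes of the dataset it wraps.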