Augmenting/modifying the torchvision.datasets?

lmnt · February 5, 2018, 3:45pm

Is it possible to read in the data from one of the torchvision datasets, say MNIST, and then insert more data and labels that I have generated as numpy arrays, and still have the dataloader work as is, just now with more data?

Or would it be best to write a new dataset class that extracts the MNIST images into numpy arrays, so that I can concatenate them with my custom images?

Thanks!

christianperone · February 5, 2018, 4:29pm

Have you looked at the ConcatDataset documentation (torch.utils.data.ConcatDataset)?

I think that would be the best approach. Alternatively, you can extend the MNIST dataset class and concatenate your data after the constructor has run, in which case you have to join your arrays with the already loaded MNIST data yourself.
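A minimal sketch of the ConcatDataset route, in case it helps. NumpyImageDataset is a hypothetical wrapper name (not part of torchvision), and the random arrays are stand-ins so the snippet runs without downloading anything. In real use the first dataset would be something like torchvision.datasets.MNIST(root, train=True, download=True, transform=transforms.ToTensor()). The wrapper returns (float tensor of shape 1x28x28, int label) on the assumption that this matches what MNIST yields with ToTensor, so both datasets produce items of the same type.

```python
import numpy as np
import torch
from torch.utils.data import ConcatDataset, Dataset


class NumpyImageDataset(Dataset):
    """Hypothetical wrapper: serves (image, label) pairs from numpy arrays."""

    def __init__(self, images, labels):
        assert len(images) == len(labels)
        self.images = images  # assumed shape (N, 28, 28), dtype uint8
        self.labels = labels  # assumed shape (N,), integer class ids

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Scale to [0, 1] and add a channel axis, similar to transforms.ToTensor()
        image = torch.from_numpy(self.images[idx]).float().div(255.0).unsqueeze(0)
        return image, int(self.labels[idx])


rng = np.random.default_rng(0)

# Stand-in for the MNIST dataset (random pixels, random labels)
mnist_like = NumpyImageDataset(
    rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8),
    rng.integers(0, 10, size=100),
)

# The extra, self-generated data held as numpy arrays
extra = NumpyImageDataset(
    rng.integers(0, 256, size=(50, 28, 28), dtype=np.uint8),
    rng.integers(0, 10, size=50),
)

# ConcatDataset chains the two: indices 0..99 hit mnist_like, 100..149 hit extra
combined = ConcatDataset([mnist_like, extra])
print(len(combined))

image, label = combined[120]  # maps to extra[20]
```

Keeping the item types identical across both datasets matters because the DataLoader's default collate function stacks items per batch, and mixing, say, int labels with tensor labels can fail there.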

lmnt · February 5, 2018, 8:13pm

Great, thank you, this is perfect.

I didn’t realize ConcatDataset was what I needed, but after concatenating the MNIST data with my modified data, torch.utils.data.DataLoader works fine on the new combined dataset.
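For anyone finding this later, a rough sketch of what that looks like end to end. The two TensorDataset objects are random stand-ins (assumptions for illustration) for the MNIST dataset and the modified data; the sizes and batch size are arbitrary.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Stand-ins: 100 "original" samples and 50 "modified" samples, 1x28x28 each
base = TensorDataset(torch.rand(100, 1, 28, 28), torch.randint(0, 10, (100,)))
extra = TensorDataset(torch.rand(50, 1, 28, 28), torch.randint(0, 10, (50,)))

combined = ConcatDataset([base, extra])

# The DataLoader treats the combined dataset like any other map-style dataset;
# shuffle=True mixes samples from both sources within each epoch
loader = DataLoader(combined, batch_size=32, shuffle=True)

seen = 0
batch_shapes = []
for images, labels in loader:
    seen += labels.size(0)
    batch_shapes.append(tuple(images.shape))

print(seen, len(batch_shapes))
```

With 150 samples and a batch size of 32, one pass yields four full batches and a final batch of 22, since drop_last defaults to False.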