Modify the dataset

I have a directory of images, and I know several ways to convert them into a labeled array or list dataset for training a model in TensorFlow. However, I am doing clustering in PyTorch, using a tensor dataset built from MNIST. To use my model on my own data, I want to convert my collection of images (for example, a dogs and cats dataset) into a dataset like the built-in MNIST one. How can I do that?

If your data and targets are already tensors, you could just pass them to torch.utils.data.TensorDataset and iterate over it directly or via a DataLoader (which would batch the samples for you, shuffle them, etc.).
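As a rough sketch of the tensor case: the random tensors below are just stand-ins for your already-loaded images and labels, and the shapes, batch size, and the 0 = cat / 1 = dog labeling are assumptions made for illustration.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical stand-in data: 100 RGB images of size 32x32
# and one integer label per image (0 = cat, 1 = dog).
images = torch.rand(100, 3, 32, 32)
labels = torch.randint(0, 2, (100,))

# TensorDataset pairs the tensors along their first dimension.
dataset = TensorDataset(images, labels)

# Index it directly to get a single (image, label) sample ...
x, y = dataset[0]

# ... or wrap it in a DataLoader to get shuffled batches.
loader = DataLoader(dataset, batch_size=16, shuffle=True)
batch_images, batch_labels = next(iter(loader))
```

Each batch from the loader then has the same (images, labels) structure you would get when iterating over a DataLoader built on the MNIST dataset.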
On the other hand, if you want to apply a transformation to each sample and, e.g., load the data lazily, you could write a custom Dataset as described here.
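A minimal sketch of such a custom Dataset might look like the following. The class name FolderImageDataset, the one-sub-directory-per-class layout (root/cat/*.png, root/dog/*.png), and the use of PIL and NumPy for decoding are all assumptions for illustration, not something prescribed by PyTorch. The temporary directory with random images is only there so the example runs on its own.

```python
import tempfile
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset, DataLoader


class FolderImageDataset(Dataset):
    """Hypothetical dataset: one sub-directory per class, e.g. root/cat, root/dog."""

    def __init__(self, root, transform=None):
        self.transform = transform
        # Sorted class names give a stable name -> integer label mapping.
        self.classes = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
        # Only paths and labels are stored here; pixels are read in __getitem__.
        self.samples = [
            (path, label)
            for label, name in enumerate(self.classes)
            for path in sorted((Path(root) / name).glob("*.png"))
        ]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        path, label = self.samples[index]
        # Lazy loading: the file is opened only when the sample is requested.
        with Image.open(path) as img:
            array = np.array(img.convert("RGB"))  # H x W x 3, uint8
        # Float tensor in [0, 1] with channels first (3 x H x W).
        image = torch.from_numpy(array).permute(2, 0, 1).float() / 255.0
        if self.transform is not None:
            image = self.transform(image)
        return image, label


# Build a tiny throwaway directory so the sketch runs end to end.
root = Path(tempfile.mkdtemp())
for name in ("cat", "dog"):
    (root / name).mkdir()
    for i in range(4):
        pixels = np.random.randint(0, 256, (28, 28, 3), dtype=np.uint8)
        Image.fromarray(pixels).save(root / name / f"{i}.png")

dataset = FolderImageDataset(root)
loader = DataLoader(dataset, batch_size=4, shuffle=True)
images, labels = next(iter(loader))
```

If your images are already organized in one folder per class like this, torchvision.datasets.ImageFolder covers the same use case and may save you from writing the class yourself.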
