Extending torchvision for handling own datasets

Hi there,

I’m very new to PyTorch, so please bear with me. I’ve been following and reading tutorials to get familiar with PyTorch. The tutorials all use the torchvision package, which contains dataset loaders for CIFAR-10/100, COCO, etc. I wanted to know if torchvision’s functionality can be extended to any non-standard dataset that I may have.

If not, then can one write their own custom dataloaders and still use the transforms features defined in torchvision?


Of course, as long as you write your own Dataset, which is very easy to implement. You can then get the speedup of multiprocessing by using DataLoader.

You may refer to ImageFolder; it is a standard implementation of Dataset.
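For illustration, here is a minimal sketch of what a custom Dataset combined with DataLoader might look like. The class name ArrayImageDataset, the scale_to_unit function, and the synthetic in-memory images are all made up for this example; in practice you would likely read images from disk and pass a torchvision.transforms pipeline as the transform argument instead.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ArrayImageDataset(Dataset):
    """Hypothetical dataset wrapping in-memory images and labels."""

    def __init__(self, images, labels, transform=None):
        assert len(images) == len(labels)
        self.images = images
        self.labels = labels
        # Any callable works here, e.g. a torchvision.transforms.Compose.
        self.transform = transform

    def __len__(self):
        # DataLoader uses this to know how many samples exist.
        return len(self.images)

    def __getitem__(self, index):
        # Return one (image, label) pair, applying the transform if given.
        image = self.images[index]
        if self.transform is not None:
            image = self.transform(image)
        return image, self.labels[index]


def scale_to_unit(image):
    # Stand-in transform (an assumption for this sketch):
    # map uint8 pixel values to floats in [0, 1].
    return image.float() / 255.0


# Synthetic data: 10 RGB images of size 32x32 with alternating labels.
images = torch.randint(0, 256, (10, 3, 32, 32), dtype=torch.uint8)
labels = torch.arange(10) % 2
dataset = ArrayImageDataset(images, labels, transform=scale_to_unit)

# num_workers=0 loads in the main process; raise it to use worker processes.
loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)
first_images, first_labels = next(iter(loader))
```

The only methods the DataLoader needs from a map-style Dataset are __len__ and __getitem__, so the same pattern should carry over to data that is not images.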

Thanks, just one more question: does this class only support image data for now, or can it be used without modification for things like text data when training RNNs?

It supports all kinds of datasets, and it can also be used to load raw text files. But you need to write your own loader (read the file into memory) and transform (convert the text data to tensors).
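As a rough sketch of the loader-plus-transform idea for text, something like the following could work. The names TextLineDataset, build_vocab, and pad_batch are hypothetical, whitespace tokenization is an assumption, and the list of strings stands in for lines read from a raw text file.

```python
import torch
from torch.utils.data import Dataset, DataLoader
from torch.nn.utils.rnn import pad_sequence


def build_vocab(lines):
    # Map each whitespace-separated token to an integer id.
    vocab = {"<pad>": 0, "<unk>": 1}
    for line in lines:
        for token in line.split():
            vocab.setdefault(token, len(vocab))
    return vocab


class TextLineDataset(Dataset):
    """Hypothetical dataset: one sample per line of text."""

    def __init__(self, lines, vocab):
        self.lines = lines
        self.vocab = vocab

    def __len__(self):
        return len(self.lines)

    def __getitem__(self, index):
        # The "transform" step: text -> tensor of token ids.
        tokens = self.lines[index].split()
        ids = [self.vocab.get(t, self.vocab["<unk>"]) for t in tokens]
        return torch.tensor(ids, dtype=torch.long)


def pad_batch(batch):
    # Sequences differ in length, so pad them to a common length
    # and keep the true lengths for use with an RNN.
    lengths = torch.tensor([len(seq) for seq in batch])
    padded = pad_sequence(batch, batch_first=True, padding_value=0)
    return padded, lengths


# Stand-in for lines read from a file into memory.
lines = ["the cat sat", "hello world", "a b c d e"]
vocab = build_vocab(lines)
text_dataset = TextLineDataset(lines, vocab)
text_loader = DataLoader(text_dataset, batch_size=3, collate_fn=pad_batch)
padded, lengths = next(iter(text_loader))
```

A custom collate_fn is used here because the default batching stacks samples directly, which fails when the sequences in a batch have different lengths.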
As for text datasets, try:

Thanks for helping a beginner out!