Loading VOC 2012 dataset with Dataloaders

The VOC 2012 dataset consists of images and their corresponding segmentation maps. I want to apply the same transforms to both the image and its segmentation map while loading. Any suggestions on how to proceed?


We don’t have a ready solution implemented, but there has been some discussion in a torchvision issue.


@Gaurav_Pandey you can easily adapt a dataset to handle co_transforms in the __call__ function (e.g. see this gist which has general structure that handles co_transforms):

and here are some relevant affine transforms to actually use – you’ll see the transforms must take in two arguments for the input and target images:
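The linked code is not reproduced here, so below is only a minimal sketch of the two-argument idea, not the transforms from the link. The class names `JointCompose`, `JointRandomHorizontalFlip` and `JointRandomCrop` are made up for illustration, and images are plain nested lists so the snippet runs without any dependencies; with PIL images or tensors you would swap in the matching flip and crop calls. The key point is that each random parameter is drawn once and then applied to both the input and the target:

```python
import random


class JointCompose:
    """Chain transforms that each take and return an (image, target) pair."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, image, target):
        for t in self.transforms:
            image, target = t(image, target)
        return image, target


class JointRandomHorizontalFlip:
    """Flip image and target together with probability p (hypothetical helper)."""

    def __init__(self, p=0.5, rng=None):
        self.p = p
        self.rng = rng if rng is not None else random.Random()

    def __call__(self, image, target):
        # Draw the random decision once so both inputs get the same flip.
        if self.rng.random() < self.p:
            image = [row[::-1] for row in image]
            target = [row[::-1] for row in target]
        return image, target


class JointRandomCrop:
    """Crop the same random (height, width) window from image and target."""

    def __init__(self, size, rng=None):
        self.height, self.width = size
        self.rng = rng if rng is not None else random.Random()

    def __call__(self, image, target):
        rows, cols = len(image), len(image[0])
        # Sample the window position once and reuse it for both inputs.
        top = self.rng.randint(0, rows - self.height)
        left = self.rng.randint(0, cols - self.width)

        def crop(array):
            return [r[left:left + self.width] for r in array[top:top + self.height]]

        return crop(image), crop(target)
```

You would then build something like `JointCompose([JointRandomHorizontalFlip(), JointRandomCrop((256, 256))])` and call it as `image, target = co_transform(image, target)` inside the dataset.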


Thanks guys. That was very helpful.

I also implemented a dataset for VOC2012.

@apaszke does PyTorch have any plans to add a VOC dataset, like the one it has for COCO?

Torchvision now has a VOC dataset implemented, but it’s not that user-friendly. Specifically, it doesn’t allow a co-transform (data augmentation such as rotation and scaling, applied to both image and target) and separate transforms (such as normalization of the images only) to exist at the same time (it raises an error here). There are many good third-party implementations, like this one.
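One way around this, as a rough sketch only, is a thin wrapper that applies the co-transform first and the separate transforms afterwards. `PairedSegmentationDataset` and its parameter names are hypothetical, not part of torchvision. It assumes the wrapped dataset returns `(image, target)` pairs and was created without any transforms of its own. The toy data below uses nested lists so the snippet runs on its own:

```python
class PairedSegmentationDataset:
    """Wrap a base dataset of (image, target) pairs (hypothetical helper).

    joint_transform takes and returns (image, target); image_transform and
    target_transform each take and return a single item.
    """

    def __init__(self, base, joint_transform=None,
                 image_transform=None, target_transform=None):
        self.base = base
        self.joint_transform = joint_transform
        self.image_transform = image_transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.base)

    def __getitem__(self, index):
        image, target = self.base[index]
        # Shared augmentation first, so image and target stay aligned.
        if self.joint_transform is not None:
            image, target = self.joint_transform(image, target)
        # Then the per-item transforms, e.g. normalization on the image only.
        if self.image_transform is not None:
            image = self.image_transform(image)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return image, target


def flip_both(image, target):
    """Toy co-transform: horizontal flip applied to both inputs."""
    return [row[::-1] for row in image], [row[::-1] for row in target]


def scale_image(image):
    """Toy image-only transform: rescale pixel values to [0, 1]."""
    return [[value / 255 for value in row] for row in image]


# Toy base dataset: one 1x2 "image" and its 1x2 "segmentation map".
base = [([[0, 255]], [[0, 1]])]
dataset = PairedSegmentationDataset(
    base, joint_transform=flip_both, image_transform=scale_image
)
image, target = dataset[0]
```

Since the wrapper defines `__len__` and `__getitem__`, it should be usable wherever a map-style dataset is expected, though subclassing `torch.utils.data.Dataset` would be the more conventional choice in a real project.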