The VOC 2012 dataset consists of images and their corresponding segmentation maps. I want to apply the same transforms to both the image and its segmentation map while loading. Any suggestions on how to approach this?
We don’t have a ready solution implemented, but there has been some discussion in a torchvision issue.
@Gaurav_Pandey you can easily adapt a dataset to handle co_transforms in the __call__ function (e.g. see this gist, which has a general structure that handles co_transforms):
and here are some relevant affine transforms to actually use – you’ll see the transforms must take in two arguments for the input and target images:
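For anyone reading later, here is a minimal sketch of the idea, not the code from the gist. The names `RandomHorizontalFlipPair` and `PairedDataset` are made up for illustration, and NumPy arrays stand in for the images (H x W x C) and segmentation maps (H x W). The key point is that the random decision is drawn once and applied to both inputs:

```python
import random

import numpy as np


class RandomHorizontalFlipPair:
    """Hypothetical co-transform: flips image and target together."""

    def __init__(self, p=0.5, rng=None):
        self.p = p
        self.rng = rng or random.Random()

    def __call__(self, image, target):
        # Draw the random decision once so both arrays get the same flip.
        if self.rng.random() < self.p:
            image = image[:, ::-1].copy()    # flip along the width axis
            target = target[:, ::-1].copy()
        return image, target


class PairedDataset:
    """Hypothetical dataset wrapper that applies a co-transform per sample."""

    def __init__(self, images, targets, co_transform=None):
        assert len(images) == len(targets)
        self.images = images
        self.targets = targets
        self.co_transform = co_transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image, target = self.images[idx], self.targets[idx]
        if self.co_transform is not None:
            # The co-transform takes two arguments and returns both.
            image, target = self.co_transform(image, target)
        return image, target


# Usage: with p=1.0 the flip is always applied to both arrays.
image = np.arange(12).reshape(2, 2, 3)
mask = np.array([[0, 1], [2, 3]])
dataset = PairedDataset([image], [mask], RandomHorizontalFlipPair(p=1.0))
flipped_image, flipped_mask = dataset[0]
```

For geometric transforms that interpolate (rotation, scaling), the segmentation map would usually need nearest-neighbour interpolation so that class labels are not blended.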
Thanks guys. That was very helpful.
Torchvision now has a VOC dataset implemented, but it’s not that user friendly. Specifically, it doesn’t allow a co-transform (data augmentation such as rotation and scaling) and separate transforms (such as normalization on images) to exist at the same time (it raises an error here). There are many good third-party implementations, like this one.
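One way to have both kinds of transform side by side is to apply the co-transform first and the separate transforms afterwards. This is only a sketch under my own assumptions, not torchvision's or any third-party implementation: `apply_transforms`, `flip_pair`, and `normalize` are hypothetical helpers, and NumPy arrays stand in for the image and segmentation map.

```python
import numpy as np


def apply_transforms(image, target, co_transform=None,
                     transform=None, target_transform=None):
    """Hypothetical helper: joint augmentation first, then per-input steps."""
    if co_transform is not None:
        # Geometric augmentation must see both inputs at once.
        image, target = co_transform(image, target)
    if transform is not None:
        # Image-only step, e.g. normalization.
        image = transform(image)
    if target_transform is not None:
        # Target-only step, e.g. remapping label ids.
        target = target_transform(target)
    return image, target


def flip_pair(image, target):
    """Hypothetical co-transform: horizontal flip of both arrays."""
    return image[:, ::-1], target[:, ::-1]


def normalize(image, mean=0.5, std=0.5):
    """Hypothetical image-only transform: scale to [0, 1], then standardize."""
    return (image.astype(np.float32) / 255.0 - mean) / std


# Usage: the mask is flipped with the image but is never normalized.
image = np.full((2, 2, 3), 255, dtype=np.uint8)
mask = np.array([[0, 1], [2, 3]])
out_image, out_mask = apply_transforms(
    image, mask, co_transform=flip_pair, transform=normalize
)
```

Inside a dataset, a call like this would go in `__getitem__` right after the image and segmentation map are loaded.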