Dear community,
I'm relatively new to machine learning in general and PyTorch in particular. I just wanted to briefly discuss something I encountered while implementing a custom Dataset class as the basis for my project, which includes simple classification (resnet34), object detection (Faster R-CNN), and instance segmentation (Mask R-CNN). I do research in the medical domain and work with the 10k images of the HAM10000 dataset.
For data augmentation I perform some random transformations at the time the images and masks are loaded. These include RandomRotation, random flips, RandomCrop, and some ColorJitter. All of them are implemented in torchvision.transforms; however, they are designed for a single input image. Any random mutation of the image, e.g. a rotation, must also be applied in exactly the same way to the mask. So I decided to stay with the given structure of torchvision.transforms and implement classes which inherit from those transforms but a) take both the image and the mask and b) first obtain the random parameters and then apply the same transformation to both the image and the mask.
What makes me wonder is that this problem can hardly be unique to me, so I don't understand why this isn't implemented already. Also, some projects I've come across perform data augmentation before the training process and only load already-augmented datasets.
So, if there are better ways to do this, let me know.
Thank you all
For reference, here is what I did for all of the transformations I use:
import torchvision.transforms as T

class Compose(T.Compose):
    def __call__(self, img, mask):
        # Pass the image and the mask through every transform together.
        for t in self.transforms:
            img, mask = t(img, mask)
        return img, mask

class ColorJitter(T.ColorJitter):
    def __call__(self, img, mask):
        # Sample the jitter once; color changes only apply to the image,
        # so the mask is passed through unchanged.
        # Note: this relies on the older torchvision behaviour where
        # get_params returns a ready-made transform; newer releases
        # return the sampled parameters instead.
        transform = self.get_params(self.brightness, self.contrast,
                                    self.saturation, self.hue)
        return transform(img), mask
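For the geometric transforms the same pattern applies, except that the sampled parameters must be used on both inputs. A minimal sketch for the flip and rotation cases, assuming torchvision.transforms.functional is available (for the mask, rotate defaults to nearest-neighbour interpolation, which keeps the label values intact):

import random
import torchvision.transforms as T
import torchvision.transforms.functional as TF

class RandomHorizontalFlip(T.RandomHorizontalFlip):
    def __call__(self, img, mask):
        # Draw the coin flip once and apply the same outcome to both.
        if random.random() < self.p:
            return TF.hflip(img), TF.hflip(mask)
        return img, mask

class RandomRotation(T.RandomRotation):
    def __call__(self, img, mask):
        # Sample one angle and rotate image and mask identically.
        angle = self.get_params(self.degrees)
        return TF.rotate(img, angle), TF.rotate(mask, angle)

These then compose naturally, e.g. transform = Compose([RandomHorizontalFlip(0.5), RandomRotation(30), ColorJitter(0.2, 0.2)]) and img, mask = transform(img, mask) inside the Dataset's __getitem__.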