# How to use custom image transformations with torchvision

My problem is fairly simple, but I’m not sure I’m solving it correctly. I will describe what I’m doing so far, and I hope someone can tell me whether it is right, as I have not found a solution online.

I have coded an algorithm that applies the “Shades of Gray” normalization to an image. I want this algorithm to run on every image of my dataset. To do this, I create a `transforms.Compose`. A snippet of the code looks like this:

``````
import torchvision
from torchvision import transforms

transform = transforms.Compose([
    transforms.RandomVerticalFlip(),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor()
])

# train_path: root folder of the training images
dataset = torchvision.datasets.ImageFolder(train_path, transform=transform)
``````

The module “color_constancy” is a self-made script that provides the shades_of_gray method. The method only accepts one image at a time, but I suppose that the transform is applied image-wise (again, correct me if I’m wrong). After this I just pass the dataset to a DataLoader and continue with the standard procedure.

Best regards.

Am

Have a look at these transform implementations, which you could use as a template for your custom transform.
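On the "image-wise" question: datasets such as `ImageFolder` apply the transform to each sample as it is loaded, so a custom transform only ever receives one image per call. The sketch below illustrates that pattern with a hypothetical stand-in class (`TinyImageDataset` is not part of torchvision, and the "images" are plain lists to keep the example dependency-free).

```python
class TinyImageDataset:
    """Hypothetical stand-in for a dataset such as ImageFolder."""

    def __init__(self, images, transform=None):
        self.images = images
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        img = self.images[index]
        if self.transform is not None:
            # The transform is called once per sample, with a single image
            img = self.transform(img)
        return img


calls = []  # records every argument the transform receives


def record_and_double(img):
    """Toy transform: remembers its input and doubles every pixel value."""
    calls.append(img)
    return [2 * p for p in img]


dataset = TinyImageDataset([[1, 2], [3, 4]], transform=record_and_double)
first = dataset[0]                                   # triggers one transform call
everything = [dataset[i] for i in range(len(dataset))]  # one call per image
```

A DataLoader then collects these individually transformed samples into batches, so the transform itself never needs to handle a batch.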

Thank you for the confirmation. While trying to implement it I ran into a problem. The shades_of_gray function takes an image as an argument:

``````
import numpy

def shades_of_gray(img, power=6, extra=6):
    # Parameters
    # ----------
    # img: numpy array
    #   The original image with format of (h, w, c)
    # power: int
    #   The degree of norm, 6 is used in reference paper
    # extra: int
    #   The degree of the norm used to normalize the per-channel vector;
    #   it is used as an exponent, so it must be a number (None would fail)

    img_dtype = img.dtype

    img = img.astype('float32')
    img_power = numpy.power(img, power)
    # Per-channel mean over height and width
    rgb_vec = numpy.power(numpy.mean(img_power, (0, 1)), 1 / power)
    rgb_norm = numpy.power(numpy.sum(numpy.power(rgb_vec, extra)), 1 / extra)
    rgb_vec = rgb_vec / rgb_norm
    rgb_vec = 1 / (rgb_vec * numpy.sqrt(3))
    img = numpy.multiply(img, rgb_vec)

    return img.astype(img_dtype)
``````

Thus, when I build the `transforms.Compose` I get: `TypeError: shades_of_gray() missing 1 required positional argument: 'img'`.

I suppose the error is not tricky to solve, but I can’t figure out what it is. From your link I suspect that I should make a class with `__call__` and `__repr__` methods, but I don’t fully understand how I should do that.
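A likely cause of this error is writing `shades_of_gray()` inside the list: that calls the function immediately, with no image, while the list is being built. A pipeline needs a callable that it can invoke later with each image. The sketch below shows the difference using a toy function and a hypothetical `apply_pipeline` helper standing in for `transforms.Compose` (the names `brighten`, `Brighten`, and `apply_pipeline` are illustrative only).

```python
def apply_pipeline(transform_list, img):
    """Stand-in for Compose: call each transform on the image, in order."""
    for t in transform_list:
        img = t(img)
    return img


def brighten(img, amount=10):
    """Toy per-image function; `img` is just a list of pixel values here."""
    return [p + amount for p in img]


class Brighten:
    """Callable wrapper: parameters go to __init__, the image goes to __call__."""

    def __init__(self, amount=10):
        self.amount = amount

    def __call__(self, img):
        return brighten(img, self.amount)

    def __repr__(self):
        return f"{self.__class__.__name__}(amount={self.amount})"


# Wrong: brighten() runs right here, without an image, and raises TypeError.
try:
    pipeline = [brighten()]
except TypeError as err:
    error_message = str(err)

# Right: Brighten(amount=5) only builds an object; the pipeline calls it
# later, once per image.
pipeline = [Brighten(amount=5)]
result = apply_pipeline(pipeline, [0, 100, 200])
```

The same reasoning applies to the real transform: the object placed in the `Compose` list must accept the image when called, and any settings such as `power` belong in the constructor.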

Thanks to @ptrblck’s link I could figure out how to implement my transform. I had to make a few tweaks to convert between PIL images and numpy arrays, but so far it isn’t throwing errors anymore. In case this proves useful for anyone in the future (I know it would have been for me), I will leave my final code below.

``````
import numpy
from PIL import Image

class shades_of_gray(object):
    #     Parameters
    #     ----------
    #     img: PIL Image
    #         The original image; converted internally to an array of format (h, w, c)
    #     power: int
    #         The degree of norm, 6 is used in reference paper

    def __call__(self, img):
        """
        :param img: (PIL Image): Image

        :return: Normalized image
        """
        # PIL image -> numpy array, remembering the original dtype
        img = numpy.asarray(img)
        img_dtype = img.dtype

        power = 6
        extra = 6

        img = img.astype('float32')
        img_power = numpy.power(img, power)
        rgb_vec = numpy.power(numpy.mean(img_power, (0, 1)), 1 / power)
        rgb_norm = numpy.power(numpy.sum(numpy.power(rgb_vec, extra)), 1 / extra)
        rgb_vec = rgb_vec / rgb_norm
        rgb_vec = 1 / (rgb_vec * numpy.sqrt(3))
        img = numpy.multiply(img, rgb_vec)
        img = img.astype(img_dtype)

        # numpy array -> PIL image, so later PIL-based transforms still work
        return Image.fromarray(img)

    def __repr__(self):
        return self.__class__.__name__ + '()'
``````

If you see anything wrong or have any tips that I could follow, feel free to share them. Also, I don’t fully understand what the `def __repr__(self)` method actually does.

Your code looks good. The `__repr__` method returns the string that is shown when you call `print(my_transform)`.
You could also remove it and just use the default Python implementation.
Other transform classes use it to print additional information about the passed arguments etc.
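As an illustration of a `__repr__` that reports the passed arguments, here is a minimal sketch of a configurable variant of the transform. It is not the poster's final code: the class name `ShadesOfGray`, the parameter name `norm_order`, and the clipping step are assumptions added for this example, and it returns a numpy array rather than a PIL image (wrapping the result in `Image.fromarray`, as in the class above, would keep it compatible with PIL-based transforms).

```python
import numpy


class ShadesOfGray:
    """Sketch of a configurable Shades of Gray transform (illustrative names)."""

    def __init__(self, power=6, norm_order=6):
        self.power = power            # degree of the per-channel Minkowski mean
        self.norm_order = norm_order  # degree of the norm applied to the channel vector

    def __call__(self, img):
        arr = numpy.asarray(img)      # accepts anything convertible to an (h, w, c) array
        out_dtype = arr.dtype
        arr = arr.astype("float32")

        # Per-channel Minkowski mean over height and width
        rgb_vec = numpy.power(
            numpy.mean(numpy.power(arr, self.power), (0, 1)), 1 / self.power
        )
        rgb_norm = numpy.power(
            numpy.sum(numpy.power(rgb_vec, self.norm_order)), 1 / self.norm_order
        )
        gains = 1 / ((rgb_vec / rgb_norm) * numpy.sqrt(3))
        arr = arr * gains

        # Assumption, not in the original code: clip to the valid range of
        # integer dtypes so out-of-range values do not wrap around on the cast.
        if numpy.issubdtype(out_dtype, numpy.integer):
            limits = numpy.iinfo(out_dtype)
            arr = numpy.clip(arr, limits.min, limits.max)
        return arr.astype(out_dtype)

    def __repr__(self):
        # Report the constructor arguments, like the built-in transforms do
        return f"{self.__class__.__name__}(power={self.power}, norm_order={self.norm_order})"


transform = ShadesOfGray(power=6, norm_order=2)
description = repr(transform)
example = transform(numpy.full((4, 4, 3), 100, dtype=numpy.uint8))
```

Printing the transform, or a `Compose` that contains it, then shows which settings were used, which helps when comparing experiments. Note that an image with an all-zero channel would cause a division by zero in this sketch.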

I am trying to add Gaussian noise as part of the image transforms. I was able to add noise to a tensor, but I want to add noise to the PIL Image data instead. How can I modify the code block below to do that?

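``````
import torch

class gaussianNoise():
    def __init__(self, mean, stddev):
        self.mean = mean
        self.stddev = stddev

    def __call__(self, tensor):
        # Sample noise with the same shape, dtype and device as the input
        noise = torch.zeros_like(tensor).normal_(self.mean, self.stddev)
        return tensor + noise
``````
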
`PIL.Image`s can be converted to numpy arrays, so you could create the array via:
``````
arr = np.array(img)