I’m following the tutorial https://pytorch.org/tutorials/beginner/data_loading_tutorial.html and I don’t know how to normalize images that have values in the range 0-255. I need to create a custom normalization transform so the images can be used to train the pretrained AlexNet network.
Usually, the images are scaled to the [0, 1] interval first (images = images / 255). Then, to normalize them, you can use torchvision's transforms.Normalize, and this is a link to the post where it’s explained how to compute the mean and std of the images.
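As a rough sketch of the two steps above (scaling to [0, 1], then computing per-channel statistics), something like the following could work. The random batch `images_uint8` is a hypothetical stand-in for a real dataset, and computing the statistics over a single in-memory batch is an assumption made for brevity. A larger dataset would usually need to accumulate them over a DataLoader.

```python
import torch

# Hypothetical stand-in for a dataset: 8 RGB images of size 32x32,
# stored as uint8 values in [0, 255] with layout (N, C, H, W).
torch.manual_seed(0)
images_uint8 = torch.randint(0, 256, (8, 3, 32, 32), dtype=torch.uint8)

# Step 1: scale to the [0, 1] interval.
images = images_uint8.float() / 255.0

# Step 2: per-channel mean and std, reduced over batch, height, and width.
mean = images.mean(dim=(0, 2, 3))
std = images.std(dim=(0, 2, 3))

# Step 3: subtract the mean and divide by the std, broadcasting the
# per-channel values over the other dimensions.
normalized = (images - mean[None, :, None, None]) / std[None, :, None, None]
```

After this, each channel of `normalized` should have a mean of approximately 0 and a std of approximately 1.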
Using torchvision.transforms.ToTensor you can:
Convert a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0] if the PIL Image belongs to one of the modes (L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK, 1) or if the numpy.ndarray has dtype = np.uint8
To normalize this, you can then use torchvision’s transforms.Normalize, as mentioned by @mariosasko.
This lets you normalize your tensor using the mean and standard deviation. The formula is image = (image - mean) / std.
A popular choice is 0.5 for both the mean and the std of every channel, since this maps tensors from the [0, 1] range to the range between -1 and 1 ( (0 - 0.5) / 0.5 = -1 and (1 - 0.5) / 0.5 = 1).