Is normalization important?

I have seen multiple implementations where they have applied normalization in transforms (mean and std).

My question is about images.

Since ToTensor() already converts each pixel to [0-255], why normalize?

This is the type of question that I loved the PyTorch forum for in 2018. (But a small thing first: ToTensor converts to 0…1.)

On the face of it, we might expect that it isn’t needed: typically the first thing that happens to images is a convolution, and you could just adjust its weights and bias to make it take 0…255 pixel values instead of “zero mean, unit variance” (over the entire dataset) inputs. So the neural network could easily represent the transformation.
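To make the "adjust weights and bias" argument concrete, here is a minimal sketch, not anything from a real training setup. The mean/std values are the commonly used ImageNet statistics and serve only as an assumed example, and the layer sizes are made up. It folds the per-channel standardization of raw 0…255 pixels into the first convolution and checks that both paths give the same output.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed per-channel statistics on the 0..1 scale (commonly used ImageNet values).
mean = torch.tensor([0.485, 0.456, 0.406], dtype=torch.float64)
std = torch.tensor([0.229, 0.224, 0.225], dtype=torch.float64)

# A first layer that expects standardized inputs. padding=0 matters here: with
# zero padding, the padded border would not have gone through the normalization,
# so the equivalence would only hold for interior pixels.
conv = nn.Conv2d(3, 8, kernel_size=3, padding=0).double()

# Standardization of raw pixels is affine per channel:
#   x_norm = (x_raw / 255 - mean) / std = scale * x_raw + shift
scale = 1.0 / (255.0 * std)
shift = -mean / std

# Fold scale into the weights and shift into the bias of a second conv.
folded = nn.Conv2d(3, 8, kernel_size=3, padding=0).double()
with torch.no_grad():
    folded.weight.copy_(conv.weight * scale.view(1, 3, 1, 1))
    folded.bias.copy_(
        conv.bias + (conv.weight * shift.view(1, 3, 1, 1)).sum(dim=(1, 2, 3))
    )

# Raw 0..255 pixel values, and the same batch after standardization.
raw = torch.randint(0, 256, (2, 3, 16, 16)).double()
normalized = (raw / 255.0 - mean.view(1, 3, 1, 1)) / std.view(1, 3, 1, 1)

# Both paths compute the same function, up to floating point rounding.
max_diff = (conv(normalized) - folded(raw)).abs().max().item()
print(max_diff)
```

Note that the folded weights are smaller than the original ones by a factor of 255 · std, roughly 57 here. So the network can represent the transformation, but a default initialization would not start anywhere near these values.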

One part might be tradition: if you look at the AlexNet paper, you see that they center the images. In a way, this standardization is an easy approximation to a whitening transformation, which has a history in statistics. Quite likely you will see a drop in accuracy (maybe on the order of a percentage point?) if you remove it.

That said, one of the crucial issues in deep learning is the mathematical stability of the network and the ability to train the network (with mathematical stability of the gradient being one of the key issues). From famous initialization strategies (and our chit chat here) to BatchNorm and friends, to recent work on normalization-free training, to adaptive optimizers, standardization (to mean zero / unit std as a reference) of activations, gradients, and weight updates has been one of the great underlying themes in training neural networks. Mean zero / unit std inputs would then be a natural starting point.

Still, I do think that doing the normalization as part of the data preparation is not ideal conceptually; I would rather view it as part of the model than as part of the data preprocessing.
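One way to treat normalization as part of the model could look like the sketch below. `InputNormalization` is a hypothetical helper name, not a PyTorch class, and the statistics and the tiny network are placeholders. It assumes the model receives 0…1 inputs, as produced by ToTensor.

```python
import torch
import torch.nn as nn


class InputNormalization(nn.Module):
    """Hypothetical helper: standardizes 0..1 image batches per channel."""

    def __init__(self, mean, std):
        super().__init__()
        # Buffers move with model.to(device) and are stored in the state_dict,
        # but they are not parameters, so the optimizer does not update them.
        self.register_buffer("mean", torch.tensor(mean).view(1, -1, 1, 1))
        self.register_buffer("std", torch.tensor(std).view(1, -1, 1, 1))

    def forward(self, x):
        return (x - self.mean) / self.std


# Placeholder network; the statistics are the commonly used ImageNet values,
# used here only as an assumed example.
model = nn.Sequential(
    InputNormalization(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)

batch = torch.rand(4, 3, 32, 32)  # stands in for a batch of ToTensor() outputs
logits = model(batch)
print(logits.shape)
```

With this layout the statistics travel with the saved weights, so whoever loads the model only has to feed 0…1 images and cannot forget or mismatch the preprocessing.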

Best regards



Oh, I mistyped that one. Yes, it is [0-1].