According to the PyTorch official website, it is advised to use the following transform (the normalisation used when training on ImageNet):
from torchvision import transforms
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
I have seen many scripts that use the pre-trained models provided by PyTorch and follow the recommendation of normalising with the ImageNet mean and standard deviation.
I do not understand why PyTorch does not recommend normalising according to the individual dataset, given that the purpose of normalisation is to bring the data to a standard normal distribution (zero mean, unit variance) to ease the optimisation process (gradient descent). This seems especially relevant for fine-tuning, where every weight and bias parameter is updated according to the new dataset.
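For illustration, here is a minimal sketch of what per-dataset normalisation could look like: first compute the per-channel mean and standard deviation over your own data, then build the transform from them. The path my_dataset/ and the fixed 224x224 resize are placeholder assumptions:

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Placeholder dataset; replace my_dataset/ with your own image folder.
dataset = datasets.ImageFolder(
    "my_dataset/",
    transform=transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()]),
)
loader = DataLoader(dataset, batch_size=64)

# Accumulate per-channel sums of x and x^2 to estimate mean and std over the whole dataset.
n_pixels = 0
channel_sum = torch.zeros(3)
channel_sq_sum = torch.zeros(3)
for images, _ in loader:
    b, c, h, w = images.shape  # images: (B, 3, H, W)
    n_pixels += b * h * w
    channel_sum += images.sum(dim=[0, 2, 3])
    channel_sq_sum += (images ** 2).sum(dim=[0, 2, 3])

mean = channel_sum / n_pixels
std = (channel_sq_sum / n_pixels - mean ** 2).sqrt()  # Var(x) = E[x^2] - E[x]^2

normalize = transforms.Normalize(mean=mean.tolist(), std=std.tolist())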
If we use the ImageNet normalisation on our own dataset, it will not bring our dataset to a standard normal distribution. Wouldn't the effect of normalising with the ImageNet statistics be much the same as subtracting arbitrary numbers picked out of thin air (from the perspective of our own dataset)?
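To make this concrete, here is a toy sketch (the 0.3 mean and 0.1 std are made-up numbers for a hypothetical dataset) showing that normalising with the ImageNet statistics does not produce zero mean and unit variance when the data is distributed differently:

import torch
from torchvision import transforms

# Toy batch whose channels have mean ~0.3 and std ~0.1, unlike ImageNet.
images = torch.randn(100, 3, 32, 32) * 0.1 + 0.3

imagenet_norm = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
out = torch.stack([imagenet_norm(img) for img in images])

print(out.mean(dim=[0, 2, 3]))  # roughly (-0.81, -0.70, -0.47), not 0
print(out.std(dim=[0, 2, 3]))   # roughly (0.44, 0.45, 0.44), not 1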