Hi, I’m wondering about this function:
torchvision.transforms.functional.normalize(tensor, mean, std)
What do the mean and std represent?
Are they the current tensor’s mean and std?
In the tutorial
Loading and normalizing CIFAR10
The output of torchvision datasets are PILImage images of range [0, 1]. We transform them to Tensors of normalized range [-1, 1].
transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
Here the first (0.5, 0.5, 0.5) is the mean of the PILImage images of range [0, 1] and the second (0.5, 0.5, 0.5) is their standard deviation, right?
In this case, how can we know the standard deviation of the original image?
Thank you in advance
The mean and std are the values that will be used in this equation:
X' = (X-mean)/std.
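A minimal sketch of that equation in plain Python (mirroring what Normalize does per channel; the values below are just illustrative, not from any real image):

```python
# Apply X' = (X - mean) / std to a list of pixel intensities.
def normalize(pixels, mean, std):
    return [(x - mean) / std for x in pixels]

channel = [0.0, 0.25, 0.5, 0.75, 1.0]  # intensities in [0, 1]
out = normalize(channel, mean=0.5, std=0.5)
print(out)  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

With mean=0.5 and std=0.5, the endpoints 0 and 1 map exactly to -1 and 1, which is the conversion the tutorial describes.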
It’s optional whether you calculate the exact mean and std of your training data or just use the generic values
mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5); using the generic values is very common practice. The purpose of these values is to convert the range of pixel intensities from
[0, 1] to [-1, 1].
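If you do want the exact statistics of your own data instead of the generic 0.5 values, here is a hedged sketch using plain Python lists in place of image tensors (with torchvision you would instead stack the dataset tensors and call `.mean()` / `.std()` per channel):

```python
import math

# Population mean and std of one channel's pixel intensities.
def channel_stats(pixels):
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((x - mean) ** 2 for x in pixels) / n
    return mean, math.sqrt(var)

# Toy single-channel "dataset" of intensities in [0, 1] (illustrative only).
pixels = [0.1, 0.4, 0.5, 0.6, 0.9]
mean, std = channel_stats(pixels)
print(mean, std)  # these are the values you would pass to Normalize
```

Computing one (mean, std) pair per channel gives you the two 3-tuples that transforms.Normalize expects for RGB images.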
Having a pixel intensity range of
[-1, 1] is especially important if you have an autoencoder and you want to use a
Tanh() activation in your last layer as the output, because
Tanh() generates outputs in the range
(-1, 1), so the input to the autoencoder should be in the same range.
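A quick sanity check of that range argument in plain Python: tanh squashes any real input into (-1, 1), which is why the autoencoder's inputs are normalized to match its outputs.

```python
import math

# tanh maps every real input into the open interval (-1, 1).
ys = [math.tanh(x) for x in [-10.0, -1.0, 0.0, 1.0, 10.0]]
assert all(-1.0 < y < 1.0 for y in ys)
print(ys)
```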
Hi, @vmirly1 thank you for your explanation
One more question: if I’m using ReLU() as my activation function, should I also convert my data to the range [-1, 1]?
Sure, no problem!
The concern was related to the activation in the last layer. The type of activation used in the intermediate layers does not affect this, and ReLU is commonly used in the intermediate layers.
Hi. Can we set a parameter so that the CNN finds the optimal normalization parameters (the mean value, std value, or other weights/biases used for each image channel) during training? If so, can you tell me how to set that up?