Hi, I’m wondering about this function:
torchvision.transforms.functional.normalize(tensor, mean, std)
What do the mean and std represent?
Are they the current tensor’s mean and std?
In the tutorial
Loading and normalizing CIFAR10
The output of torchvision datasets are PILImage images of range [0, 1]. We transform them to Tensors of normalized range [-1, 1].
transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
Here the first (0.5, 0.5, 0.5) is the mean of the PILImage images of range [0, 1] and the second (0.5, 0.5, 0.5) is their standard deviation, right?
In this case, how can we know the standard deviation of the original image?
Thank you in advance
The mean and std are the values that will be used in this equation:
X' = (X-mean)/std.
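A minimal sketch of that equation in plain Python (mirroring what Normalize does per channel; the values below are just illustrative, not from any real image):

```python
# Apply X' = (X - mean) / std to a list of pixel intensities.
def normalize(pixels, mean, std):
    return [(x - mean) / std for x in pixels]

channel = [0.0, 0.25, 0.5, 0.75, 1.0]  # intensities in [0, 1]
out = normalize(channel, mean=0.5, std=0.5)
print(out)  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

With mean=0.5 and std=0.5, the endpoints 0 and 1 map exactly to -1 and 1, which is the conversion the tutorial describes.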
It’s optional whether you calculate the exact mean and std of your training data or just use the generic values
mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5); using the generic values is very common practice. The purpose of these values is to convert the range of pixel intensities from
[0, 1] to [-1, 1].
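If you do want the exact statistics of your own data instead of the generic 0.5 values, here is a hedged sketch using plain Python lists in place of image tensors (with torchvision you would instead stack the dataset tensors and call `.mean()` / `.std()` per channel):

```python
import math

# Population mean and std of one channel's pixel intensities.
def channel_stats(pixels):
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((x - mean) ** 2 for x in pixels) / n
    return mean, math.sqrt(var)

# Toy single-channel "dataset" of intensities in [0, 1] (illustrative only).
pixels = [0.1, 0.4, 0.5, 0.6, 0.9]
mean, std = channel_stats(pixels)
print(mean, std)  # these are the values you would pass to Normalize
```

Computing one (mean, std) pair per channel gives you the two 3-tuples that transforms.Normalize expects for RGB images.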
Having a pixel intensity range of
[-1, 1] is especially important if you have an autoencoder and you want to use a
Tanh() activation in your last layer as the output, because
Tanh() generates outputs in the range
(-1, 1), so the input to the autoencoder should be in the same range.
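A quick sanity check of that range argument in plain Python: tanh squashes any real input into (-1, 1), which is why the autoencoder's inputs are normalized to match its outputs.

```python
import math

# tanh maps every real input into the open interval (-1, 1).
ys = [math.tanh(x) for x in [-10.0, -1.0, 0.0, 1.0, 10.0]]
assert all(-1.0 < y < 1.0 for y in ys)
print(ys)
```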
Hi, @vmirly1 thank you for your explanation
One more question: if I’m using ReLU() as my activation function, should I also convert my data to the range [-1, 1]?
Sure, no problem!
The concern was related to the activation in the last layer. The type of activation used in the intermediate layers does not affect this, and ReLU is commonly used in the intermediate layers.
Hi. Can we set a parameter so that the CNN finds the optimal normalization parameters (the mean value, std value, or other weights/biases used for each image channel) during training? If so, can you tell me how to set that up?