I’ve looked everywhere but couldn’t quite find what I want. Basically, the MNIST dataset has images with pixel values in the range [0, 255]. People say that, in general, it is good to do the following:

- Scale the data to the [0,1] range.
- Normalize the data to have zero mean and unit standard deviation: (data - mean) / std. (A sketch of both steps together follows below.)
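In plain PyTorch, the two steps together would presumably look something like this (a minimal sketch; data here is a hypothetical stand-in for raw pixels, not the actual dataset):

import torch

# stand-in for a raw uint8 image, values in [0, 255]
data = torch.randint(0, 256, (1, 28, 28)).float()

scaled = data / 255.0                                   # step 1: scale to [0, 1]
standardized = (scaled - scaled.mean()) / scaled.std()  # step 2: zero mean, unit std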
Unfortunately, no one ever shows how to do both of these things. They all just subtract a mean of 0.1307 and divide by a standard deviation of 0.3081.
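In torchvision terms, that corresponds to something like this (a sketch showing the Normalize transform on its own, not any one tutorial's full pipeline):

import torchvision.transforms as transforms

# subtracts 0.1307 and divides by 0.3081, channel-wise
normalize = transforms.Normalize((0.1307,), (0.3081,))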
These values are basically the mean and the standard deviation of the dataset divided by 255:
from torchvision.datasets import MNIST

# Load the raw training set (no transforms applied)
trainset = MNIST(root='./data', train=True, download=True)

# Statistics of the raw uint8 pixel values
print('Min Pixel Value: {} \nMax Pixel Value: {}'.format(trainset.data.min(), trainset.data.max()))
print('Mean Pixel Value: {} \nPixel Values Std: {}'.format(trainset.data.float().mean(), trainset.data.float().std()))

# The same statistics divided by 255 reproduce the "magic" constants
print('Scaled Mean Pixel Value: {} \nScaled Pixel Values Std: {}'.format(trainset.data.float().mean() / 255, trainset.data.float().std() / 255))
This outputs the following:
Min Pixel Value: 0
Max Pixel Value: 255
Mean Pixel Value: 33.31002426147461
Pixel Values Std: 78.56748962402344
Scaled Mean Pixel Value: 0.13062754273414612
Scaled Pixel Values Std: 0.30810779333114624
However, this clearly does neither of the above! The resulting data will not be between [0, 1], and will not have mean 0 or std 1. In fact, this is what we are doing:

(data - mean/255) / (std/255)

which is very different from this:

(scaled_data - mean/255) / (std/255)

where scaled_data is just data / 255.
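To make the difference concrete, here is a quick check reusing trainset from above (a sketch; the numbers in the comments are approximate):

wrong = (trainset.data.float() - 0.1307) / 0.3081        # normalizes the raw [0, 255] data
right = (trainset.data.float() / 255 - 0.1307) / 0.3081  # scales to [0, 1] first, then normalizes

print(wrong.mean(), wrong.std())  # roughly 107.7 and 255.0 -- neither zero mean nor unit std
print(right.mean(), right.std())  # roughly 0.0 and 1.0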