Normalization of CIFAR and MNIST datasets


(David Lopes de Macêdo) #1

The PyTorch Tutorial (https://github.com/pytorch/tutorials/blob/master/Deep%20Learning%20with%20PyTorch.ipynb) says that the CIFAR dataset needs to be normalized with 0.5, since we are getting PIL images in the range [0, 1]:

# The output of torchvision datasets are PIL Image images of range [0, 1].
# We transform them to Tensors of normalized range [-1, 1]
transform=transforms.Compose([transforms.ToTensor(),
                              transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                             ])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, 
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4, 
                                          shuffle=False, num_workers=2)

But the MNIST example (pytorch/examples/mnist) uses values very different from 0.5 to normalize the data:

datasets.MNIST('../data', train=True, download=True,
               transform=transforms.Compose([
                   transforms.ToTensor(),
                   transforms.Normalize((0.1307,), (0.3081,))
               ])),
batch_size=args.batch_size, shuffle=True, **kwargs)
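Those values are not arbitrary: 0.1307 and 0.3081 are the mean and standard deviation of the MNIST training pixels (after `ToTensor()` scales them to [0, 1]). A minimal sketch of how such statistics are computed; a small random array stands in for the real dataset here so nothing needs to be downloaded (with torchvision you would use the actual MNIST tensors instead):

```python
import numpy as np

# Stand-in for the MNIST training images scaled to [0, 1]:
# shape (num_images, 28, 28). With the real dataset you would use
# datasets.MNIST(...).data / 255.0 instead.
rng = np.random.default_rng(0)
images = rng.random((100, 28, 28))

# Dataset-wide statistics over every pixel of every image.
mean = images.mean()
std = images.std()

# These are the numbers you would pass to transforms.Normalize((mean,), (std,)).
print(mean, std)

# After normalizing with the dataset's own statistics,
# the dataset has zero mean and unit variance.
normalized = (images - mean) / std
print(normalized.mean(), normalized.std())
```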

Why are we using two different approaches to normalize MNIST and CIFAR? Which approach is correct?

I guess we are NOT normalizing with the dataset statistics in the first case, but only rescaling each image. If this is the case, I think we should clarify this in the PyTorch Tutorial.

Thanks,

David


#2

MNIST is not natural images; its data distribution is quite different. The values 0.1307 and 0.3081 are the mean and standard deviation of the MNIST training set, so that example normalizes with the dataset's own statistics.
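If you want to normalize CIFAR the same way, you can compute per-channel statistics yourself. A sketch, with a random array standing in for the actual CIFAR-10 training tensors (shape and layout as in PyTorch, channels first):

```python
import numpy as np

# Stand-in for CIFAR-10 training images in [0, 1]:
# shape (num_images, 3, 32, 32).
rng = np.random.default_rng(0)
images = rng.random((50, 3, 32, 32))

# Per-channel mean and std, reducing over images, height, and width.
mean = images.mean(axis=(0, 2, 3))
std = images.std(axis=(0, 2, 3))

# Three values each, one per RGB channel; these tuples would
# replace (0.5, 0.5, 0.5) in transforms.Normalize.
print(mean, std)
```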