Normalization of input image

I am a beginner to PyTorch. As I read the tutorials, I keep seeing an expression like this to normalize the input data:

transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))

However, if I understand correctly, this step basically does

input[channel] = (input[channel] - mean[channel]) / std[channel]

according to the documentation.
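For instance, here is a minimal sketch (my own illustration, assuming the tensor has already been scaled to [0, 1] by ToTensor()) showing that mean = std = 0.5 simply rescales the values to [-1, 1]:

import torch
from torchvision import transforms

# With mean = std = 0.5, (x - 0.5) / 0.5 maps values from [0, 1] to [-1, 1].
x = torch.rand(3, 4, 4)  # stand-in for a 3-channel image tensor in [0, 1]
y = transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))(x)
print(y.min().item(), y.max().item())  # close to -1 and 1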

So the question is: in order to normalize an input image, why do we just set the mean and std to 0.5? Shouldn't we first calculate the mean and std of the whole image and then use those values to normalize? It doesn't have to be the case that every image has the same mean and std. Am I missing something here?

Any comments and ideas are highly appreciated! Thank you!


Well, yes. You have to compute them or find them. For ImageNet you can just google those numbers, but if you work with a custom dataset it'd be good to compute them yourself.


Sorry, I am a beginner here, so maybe I am asking a silly question. Could you please tell me how to pass a sample-dependent mean and std to transforms.Normalize(mean=, std=)? Everything I find assigns fixed values to them. How can I pass a specific data point and its np.mean() and np.std() to transforms.Normalize?

Oh, sorry for the confusion. It's not sample-dependent but dataset-dependent. For example, here you can find the parameters for ImageNet. In short, people compute the mean and std over all the images of the dataset (assuming you are working with pictures and not other kinds of data).


You can find them in lines 195-196 of the ImageNet example, provided for the torchvision models:

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
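As a usage sketch (the dataset path and the particular augmentations here are placeholders of mine, not taken from the linked example), the normalization typically goes last in the transform pipeline, after ToTensor():

from torchvision import datasets, transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),   # scales pixel values to [0, 1]
    normalize,               # then applies the ImageNet statistics
])

# Placeholder path; any ImageFolder-style dataset works the same way.
train_set = datasets.ImageFolder("path/to/train", transform=train_transform)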

You may find this interesting


Oh, I see!
Yes, that’s my mistake. Thank you!

Hi. May I ask how to define the mean and std values for each image channel? My dataset consists of natural satellite images. Should I just use 0.5, or compute the mean and std for each channel of each image?
Moreover, can we make the CNN learn the optimal parameters (mean, std, or other per-channel weights/biases) for this preprocessing? If so, can you tell me how to set that up?

Hi,
You should compute the mean and std for each channel over all the images in the dataset (or a representative subset if your dataset is very big).

Making that preprocessing learnable is not very practical; it's just a statistical normalization.

In the worst case you can set everything to 0.5, which is just an approximation.
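For instance, here is a minimal sketch of that computation (the ImageFolder path, batch size, and the assumption that images arrive as [0, 1] tensors are mine, not from the thread):

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Placeholder dataset; any dataset returning (C, H, W) tensors works.
dataset = datasets.ImageFolder("path/to/images", transform=transforms.ToTensor())
loader = DataLoader(dataset, batch_size=64, num_workers=4)

n_pixels = 0
channel_sum = torch.zeros(3)
channel_sq_sum = torch.zeros(3)

for images, _ in loader:
    # images: (B, C, H, W) -> accumulate per-channel sums over B, H and W
    n_pixels += images.size(0) * images.size(2) * images.size(3)
    channel_sum += images.sum(dim=[0, 2, 3])
    channel_sq_sum += (images ** 2).sum(dim=[0, 2, 3])

mean = channel_sum / n_pixels
std = (channel_sq_sum / n_pixels - mean ** 2).sqrt()  # std = sqrt(E[x^2] - E[x]^2)

print(mean, std)  # pass these to transforms.Normalize(mean, std)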

Thank you very much. I have compared different preprocessing strategies, i.e. setting all the mean and std values to 0.5 versus computing the mean and std for each channel over all the images in the dataset. The results indicate that the 0.5 setting is more conducive to improving the final accuracy.

Should we always normalize the input images?

It is strongly recommended. Networks may be able to fit any range of values, but it's been shown that normalization improves performance.


May I ask how you calculate the mean and standard deviation for a dataset?

Hi,
First of all, if the dataset is huge, pick a representative subset.
Then you just need to iterate over that set:

- Ideally, you would save all the pixel values (per channel) and compute the mean and std of those.
- Since this can be memory-demanding, you can instead save the mean of each image (per channel) and average those. If all the images are the same size, this is equivalent to the previous formulation; otherwise it's an approximation.

Once you have the mean, you can compute the std in an analogous way.
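A rough sketch of that second, memory-friendly approach (the ImageFolder path is a placeholder; any dataset yielding (C, H, W) tensors in [0, 1] works): average the per-image channel means first, then make a second pass for the std.

import torch
from torchvision import datasets, transforms

# Placeholder dataset; swap in your own dataset of (C, H, W) tensors.
dataset = datasets.ImageFolder("path/to/images", transform=transforms.ToTensor())

# First pass: mean of the per-image channel means.
per_image_means = torch.stack([img.mean(dim=[1, 2]) for img, _ in dataset])
mean = per_image_means.mean(dim=0)

# Second pass: per-image mean squared deviation from the global mean,
# averaged over the dataset, then square-rooted to get the std.
per_image_sq = torch.stack([((img - mean[:, None, None]) ** 2).mean(dim=[1, 2])
                            for img, _ in dataset])
std = per_image_sq.mean(dim=0).sqrt()

print(mean, std)  # plug these into transforms.Normalize(mean, std)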