Understanding transform.Normalize()

Just note that you need to use your own mean and std if your dataset is not similar to ImageNet. In the 3-channel case you mentioned, you are using the mean and std from ImageNet, which works for most datasets that resemble it; but if you are working with something like medical images, you need to compute the proper mean and std for your own dataset.

@bhushans23, @InnovArul
In this case we are transforming from [0,1] to [-1,1] using normalization.

Normalization, however, usually means subtracting the dataset mean from each data point and then dividing by the dataset's standard deviation. In our case, if you consider the dataset to be the 11 numbers in [0, 1], i.e. (0.0, 0.1, 0.2, ..., 0.9, 1.0), its mean is 0.5 but its stddev is 0.316.

We use transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), that is, mean=0.5 and stddev=0.5 for all three channels.

Can someone please explain how exactly we arrived at these numbers, and what one would do if the image is in the range (0, 255)?

Hi,

There is a main difference here. If you use mean=0.5 and std=0.5, your output values will lie in [-1, 1] based on the normalization formula (x - mean) / std, which is also called the z-score. This score can fall outside [-1, 1] when the mean and std of a dataset such as ImageNet are used.
The definition says that we should use the population mean and std, but since those are usually unavailable, the sample mean/std can be used as an estimate.
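
To make that concrete, here is a small sketch (the tensor is just random data standing in for an image already scaled to [0, 1] by ToTensor()):

import torch
from torchvision import transforms

x = torch.rand(3, 4, 4)  # fake image, values already in [0, 1]

# mean=0.5, std=0.5 maps [0, 1] exactly onto [-1, 1]:
# (0 - 0.5) / 0.5 = -1 and (1 - 0.5) / 0.5 = 1
norm_half = transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
print(norm_half(x).min().item(), norm_half(x).max().item())  # always within [-1, 1]

# With ImageNet statistics the z-score can leave [-1, 1],
# e.g. (1 - 0.406) / 0.225 ≈ 2.64 for the third channel
norm_imagenet = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
print(norm_imagenet(x).min().item(), norm_imagenet(x).max().item())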

That's helpful, thanks.

I have seen that for training on the MNIST dataset we use transforms.Normalize((0.1307,), (0.3081,)).
My understanding is that we calculate the mean of the dataset and subtract it from each image.

People use these values directly in their code, but there is no calculation showing how they were derived.

Also, is there a way to automate this so that, instead of hard-coding the values, the mean of the dataset is calculated and subtracted automatically?

Would using batch norm as the first layer of our network have a similar effect?

See the answer by Soumith himself here; I hope that's what you are looking for: Normalization in the mnist example
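
If you would rather compute the values than hard-code them, here is a minimal sketch for MNIST. It simply stacks the whole training set in memory, which is fine for MNIST; for larger datasets you would accumulate the statistics batch-wise instead:

import torch
from torchvision import datasets, transforms

# Load MNIST with only ToTensor(), so pixel values are floats in [0, 1]
dataset = datasets.MNIST(root="./data", train=True, download=True,
                         transform=transforms.ToTensor())

# Shape (60000, 1, 28, 28); a single channel, so one global mean/std
data = torch.stack([img for img, _ in dataset])
print(data.mean().item(), data.std().item())  # roughly 0.1307 and 0.3081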

Is this correct, or should all values be positive?
mean=[-0.16160992, -0.09780669, 0.44261688]
std = [1.3066291, 1.3798972, 1.4423286]

It depends on your dataset, and if the mean of all samples is negative (which might be the case), then these values look alright.

EDIT: Just to avoid confusion: if you are working with images that use uint8 pixel values, the mean should be positive, since these values cannot be negative. However, for any other dataset the mean might be whatever makes sense. :wink:

The transformation to [-1, 1] is performed to keep the values centered around 0, which helps with faster convergence.

What if the values are not within the range [-1, 1] after normalization? I checked the max and min values and they fall outside that range. My transformation:

transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean=[0.35, 0.35, 0.35], std=[0.12, 0.12, 0.12])])

Normalize does not necessarily create output values in the range [-1, 1]; it computes the “standard score” as explained by @Nikronic in this post.

Do you need the output values to be in a specific range?

Hi, I am a newbie in machine learning. I still wonder why the maximum is 1 and the minimum is 0?

The whole dataset has been divided by 255: transforms.ToTensor() converts uint8 pixel values in [0, 255] to floats in [0, 1].

Is there a script or piece of code I can run on my own images to get values similar to the ones below to pass to the Normalize method?

transforms.Normalize(mean=[0.485, 0.456, 0.406],
                     std=[0.229, 0.224, 0.225]) 

You could use an approach posted e.g. in this thread.
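
For reference, one possible sketch of such a script (channel_mean_std is a made-up helper name, and the dataset is assumed to already return tensors in [0, 1], e.g. via ToTensor()). Reducing over dims 0, 2 and 3 is what guarantees one value per channel:

import torch
from torch.utils.data import DataLoader

def channel_mean_std(dataset, batch_size=64):
    """Accumulate per-channel mean and std over a dataset of (C, H, W) tensors."""
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    n_pixels = 0
    channel_sum = 0.0
    channel_sq_sum = 0.0
    for images, _ in loader:
        b, _, h, w = images.shape
        n_pixels += b * h * w
        # sum over batch, height and width; keep only the channel dimension
        channel_sum += images.sum(dim=[0, 2, 3])
        channel_sq_sum += (images ** 2).sum(dim=[0, 2, 3])
    mean = channel_sum / n_pixels
    std = (channel_sq_sum / n_pixels - mean ** 2).sqrt()
    return mean, std  # one value per channel, e.g. 3 values for RGB images

# usage (my_dataset is a placeholder): mean, std = channel_mean_std(my_dataset)
# then pass them to transforms.Normalize(mean.tolist(), std.tolist())

If you end up with hundreds of values instead of 3, you are most likely reducing over the wrong dimensions, e.g. keeping height/width instead of the channels.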

Thank you. Do you know why I am getting a tensor of 600 values instead of 3 values for the mean and std when using train_loader? With my other DataLoader I get exactly 3 values.

Isn't the maximum value in each channel 255?
I'm assuming we didn't divide each pixel by 255.

It depends on how you want to apply transforms.Normalize().
Usually, transforms.ToTensor() will scale the pixel values to be between [0, 1].
transforms.Normalize() is supposed to work on these tensors.
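
A quick sketch of that behaviour on a dummy uint8 image (the array is random data, purely for illustration):

import numpy as np
from torchvision import transforms

img = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)  # H x W x C, values in [0, 255]

t = transforms.ToTensor()(img)
print(t.dtype, t.min().item(), t.max().item())  # torch.float32, values in [0, 1]

# Normalize then operates on these [0, 1] tensors
out = transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])(t)
print(out.min().item(), out.max().item())  # now within [-1, 1]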

Thanks for your contribution.

In my case, I have CT images with min and max Hounsfield intensity values of -1024 and 3597, which I scaled to be within [0, 1]. From the discussion so far, I realized that there is a need to normalize for better performance. My question is: do I need to do this for the validation and testing datasets as well? If yes, can I use the computed mean and standard deviation of the training dataset, or should I compute them separately for the validation and testing datasets?

I don't think this is generally applicable to any given grayscale/RGB image(s). You need to compute the mean and standard deviation of your own dataset.
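
Regarding the validation/test question above: a common practice is to compute the statistics on the training split only and reuse the same values for validation and test, so that every split goes through an identical preprocessing pipeline. A minimal sketch, assuming hypothetical single-channel CT statistics (and a helper like the channel_mean_std sketch earlier in the thread):

from torchvision import transforms

# Hypothetical values, computed on the *training* split only
train_mean, train_std = [0.21], [0.18]

normalize = transforms.Normalize(mean=train_mean, std=train_std)

train_transform = transforms.Compose([transforms.ToTensor(), normalize])
# Validation and test reuse the training statistics instead of their own
eval_transform = transforms.Compose([transforms.ToTensor(), normalize])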