Understanding transform.Normalize( )

Rui_Li · October 17, 2019, 3:17am

It really helps, thanks~

snip3r77 · January 2, 2020, 7:29am

So if we encounter grayscale images, we will use

transforms.Normalize([0.5], [0.5])

and if we encounter RGB (3 channels), we will use the following:

transforms.Normalize(mean=[0.485, 0.456, 0.406],
                     std=[0.229, 0.224, 0.225])

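The per-channel behaviour behind these two calls can be sketched in plain PyTorch. This is a minimal illustration, assuming a C x H x W float tensor; `normalize_per_channel` is a hypothetical helper written for this example, not the torchvision implementation. It applies (img[c] - mean[c]) / std[c] to each channel, which is why the length of mean and std has to match the number of channels.

```python
import torch


def normalize_per_channel(img, mean, std):
    """Hypothetical helper: (img[c] - mean[c]) / std[c] for a C x H x W tensor."""
    # Reshape the statistics to C x 1 x 1 so they broadcast over H and W.
    mean = torch.as_tensor(mean, dtype=img.dtype).view(-1, 1, 1)
    std = torch.as_tensor(std, dtype=img.dtype).view(-1, 1, 1)
    return (img - mean) / std


# Grayscale: one channel, so one mean and one std.
gray = torch.full((1, 2, 2), 0.5)
gray_out = normalize_per_channel(gray, [0.5], [0.5])

# RGB: three channels, so three means and three stds.
# Each channel is filled with its own mean, so the result should be all zeros.
rgb = torch.stack([torch.full((2, 2), v) for v in (0.485, 0.456, 0.406)])
rgb_out = normalize_per_channel(rgb, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
```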
Hong_Cheng · January 6, 2020, 2:29am

Here is my implementation of a custom transform that combines ToTensor() and normalization, which is called ToTensor_Norm().

class ToTensor_Norm(object):
    """Convert ndarrays in sample to Tensors."""
    def __init__(self, mean, std):
        self.mean = mean
        self.std = std

    def __call__(self, sample):
        image, pls, tr_angles, positions = \
            sample['image'], sample['pls'], sample['tr_angles'], sample['positions']
        # add a channel axis: a single-channel H x W image becomes 1 x H x W
        image = image[np.newaxis, :, :]
        # for a multi-channel image, swap the color axis instead, because
        # numpy image: H x W x C
        # torch image: C x H x W
        # image = image.transpose((2, 0, 1))
        image = torch.from_numpy(image)
        for t, m, s in zip(image, self.mean, self.std):
            t.sub_(m).div_(s)

        return {'image': image,
                'pls': torch.from_numpy(pls),
                'tr_angles': torch.from_numpy(tr_angles),
                'positions': torch.from_numpy(positions)
                }



transformed_dataset = PLDataset(csv_file=csv_file,
                                root_dir=root_dir,
                                transform=transforms.Compose([ToTensor_Norm([5.50180734], [8.27773514])]),
                                # transform=transforms.Compose([ToTensor()])
                                )

rajanala · April 2, 2020, 3:37am

I had the same doubt while learning PyTorch. Normalize depends on the number of channels. MNIST is grayscale with only one channel, so you can use transforms.Normalize((0.5,), (0.5,)). With three channels, for example CIFAR10, you need to specify a mean and std for each channel.

achyut-srivastava · June 16, 2020, 6:09pm

If your pixel values are in [0, 255], you have to divide by 255 first, and then you can proceed with transforms.Normalize().
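As a rough illustration of that order of operations, here is a pure-Python sketch on a handful of made-up pixel values. The `normalize` function is a hypothetical stand-in for the (x - mean) / std arithmetic, and mean=0.5, std=0.5 are only example values. When the input is a PIL image or a uint8 array, transforms.ToTensor() already performs the division by 255, so the manual step is only needed when you build the tensor yourself.

```python
def normalize(x, mean, std):
    """Stand-in for the per-value arithmetic: (x - mean) / std."""
    return (x - mean) / std


# Made-up uint8-style pixel values in [0, 255].
raw_pixels = [0, 64, 128, 255]

# Step 1: bring the values into [0, 1].
scaled = [p / 255 for p in raw_pixels]

# Step 2: normalize with statistics that were defined on the [0, 1] scale.
normalized = [normalize(x, mean=0.5, std=0.5) for x in scaled]

print(scaled[0], scaled[-1])          # 0.0 1.0
print(normalized[0], normalized[-1])  # -1.0 1.0
```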

Nikronic · July 8, 2020, 2:21pm

Just note that you need to use your own mean and std if your dataset is not similar to ImageNet. In the 3-channel case you mentioned, you are using the mean and std from ImageNet, which work for most datasets that are similar. If you are using a different kind of data, such as medical images, you need to compute the proper mean and std from your own dataset.

SiddharthSingi · July 17, 2020, 1:11pm

@bhushans23 , @InnovArul
In this case we are transforming from [0,1] to [-1,1] using normalization.

Normalization, however, usually means subtracting the dataset mean from each data point and then dividing by the dataset's standard deviation. In our case, if you consider the dataset to be 11 numbers from [0,1], i.e. (0.0, 0.1, 0.2, ..., 0.9, 1.0), its mean=0.5 but its stddev=0.316.

We use transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), that is, mean=0.5 and stddev=0.5 for all three channels.

Can someone please explain how exactly we got to these numbers, and what one would do if the image is in the range [0, 255]?
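The figures quoted for the 11-number example can be checked with the standard library, and the same arithmetic suggests one way to handle a [0, 255] input. This is only a sketch: `normalize_unit` and `normalize_raw` are hypothetical names, and the second one assumes you would rather rescale the statistics by 255 than rescale the pixels.

```python
import statistics

# The 11 evenly spaced values from the post above.
values = [i / 10 for i in range(11)]

mean = statistics.mean(values)          # about 0.5
pop_std = statistics.pstdev(values)     # population std, about 0.316
sample_std = statistics.stdev(values)   # sample std, about 0.332


def normalize_unit(x, mean, std):
    """Divide a [0, 255] value by 255, then apply (x - mean) / std."""
    return (x / 255 - mean) / std


def normalize_raw(x, mean, std):
    """Same result without rescaling x: multiply mean and std by 255 instead."""
    return (x - 255 * mean) / (255 * std)


# Both routes agree up to floating point rounding.
for pixel in (0, 100, 255):
    assert abs(normalize_unit(pixel, 0.5, 0.5) - normalize_raw(pixel, 0.5, 0.5)) < 1e-9
```

So the 0.316 in the post is the population standard deviation of those 11 values, while (0.5, 0.5) is not a dataset statistic at all.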

Nikronic · July 17, 2020, 7:54pm

Hi,

There is one main difference here. If you use mean=0.5 and std=0.5, your output values will be in [-1, 1] for inputs in [0, 1], based on the normalization formula (x - mean) / std, which is also called the z-score. This score can fall outside [-1, 1] when you use the mean and std of a dataset, for instance ImageNet.
The definition says that we need to use the population mean and std, but since they are usually unavailable, the sample mean/std can be used as an estimate.
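To make the formula concrete, here is a small pure-Python sketch that evaluates (x - mean) / std at the two ends of the [0, 1] input range. The helper names are made up for this example, and the ImageNet numbers are the red-channel values quoted earlier in the thread.

```python
def normalize(x, mean, std):
    """The formula from the post: (x - mean) / std."""
    return (x - mean) / std


def output_range(mean, std, lo=0.0, hi=1.0):
    """Smallest and largest output for inputs between lo and hi (std > 0)."""
    return normalize(lo, mean, std), normalize(hi, mean, std)


# mean=0.5, std=0.5 maps [0, 1] exactly onto [-1, 1].
print(output_range(0.5, 0.5))  # (-1.0, 1.0)

# ImageNet red channel: the result goes beyond [-1, 1].
imagenet_lo, imagenet_hi = output_range(0.485, 0.229)
print(round(imagenet_lo, 3), round(imagenet_hi, 3))  # -2.118 2.249
```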

SiddharthSingi · July 20, 2020, 1:21pm

That's helpful, thanks.

nshrimali · July 23, 2020, 12:44pm

I have seen that when training on the MNIST dataset, we use transforms.Normalize((0.1307,), (0.3081,)).
My understanding is that we calculate the mean of the dataset and subtract it from each image.

People use these values directly in their code, but no calculation shows how they are derived.

Also, is there a way to automate this, so that instead of hard-coding the values, the code calculates the mean of the dataset and subtracts it?

Would using batch norm as the first layer of our network have a similar effect?

Rishi · July 28, 2020, 1:52am

See the answer by Soumith himself in the thread "Normalization in the mnist example". I hope that's what you are looking for.

monster · August 2, 2020, 7:55pm

Is this correct, or should all values be positive?
mean=[-0.16160992, -0.09780669, 0.44261688]
std = [1.3066291, 1.3798972, 1.4423286]

ptrblck · August 3, 2020, 2:40am

It depends on your dataset, and if the mean of all samples is negative (which might be the case), then these values look alright.

EDIT: Just to avoid confusion: if you are working with images that use uint8 pixel values, the mean should be positive, since these values cannot be negative. However, for any other dataset the mean might be whatever makes sense.

nik1806 · August 14, 2020, 8:00am

The transformation to [-1,1] is performed to keep the values centered around 0. This helps with faster convergence.

said_rasidin · August 16, 2020, 5:47pm

What if the values are not within the range [-1, 1] after Normalize? I checked the max and min values and they are outside that range. My transformation:

transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean=[0.35, 0.35, 0.35], std=[0.12, 0.12, 0.12])])

ptrblck · August 18, 2020, 8:09am

Normalize does not necessarily create output values in the range [-1, 1], but the “standard score” as explained by @Nikronic in this post.

Do you need the output values to be in a specific range?
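For the mean=0.35, std=0.12 values in the question, a quick pure-Python check shows where the outputs land. It assumes the inputs are in [0, 1], as they would be after ToTensor() on a uint8 image. Variable names are made up for this sketch.

```python
mean, std = 0.35, 0.12

# Outputs for the darkest and brightest possible input.
out_min = (0.0 - mean) / std   # about -2.917
out_max = (1.0 - mean) / std   # about 5.417

# Only inputs within one std of the mean end up inside [-1, 1].
inside_lo = mean - std         # about 0.23
inside_hi = mean + std         # about 0.47

print(round(out_min, 3), round(out_max, 3))      # -2.917 5.417
print(round(inside_lo, 2), round(inside_hi, 2))  # 0.23 0.47
```

With these statistics, any pixel brighter than roughly 0.47 or darker than roughly 0.23 produces a value outside [-1, 1], which matches what was observed.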

Hieu_Nguyen · August 24, 2020, 9:14am

Hi, I am a newbie in machine learning. I still wonder why the maximum is 1 and the minimum is 0?

yongen9696 · September 17, 2020, 8:13am

The whole dataset has been divided by 255.

Mona_Jalal · October 14, 2020, 1:03am

Is there a script or piece of code I can run on my own images to get the corresponding values, similar to the ones below, to pass to the Normalize method?

transforms.Normalize(mean=[0.485, 0.456, 0.406],
                     std=[0.229, 0.224, 0.225])

ptrblck · October 14, 2020, 6:52am

You could use an approach posted e.g. in this thread.
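The linked thread is not reproduced here. As one possible sketch, not necessarily the approach from that thread, the per-channel mean and std can be accumulated batch by batch over a DataLoader. It assumes the loader yields float image batches of shape N x C x H x W as the first item, already scaled to [0, 1]. `channel_mean_std` and the random `fake_images` tensor are hypothetical stand-ins for your own code and data.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def channel_mean_std(loader):
    """Per-channel mean and population std over every pixel in the loader."""
    n_pixels = 0
    channel_sum = None
    channel_sq_sum = None
    for images, *_ in loader:                    # images: N x C x H x W
        images = images.double()                 # accumulate in float64
        n_pixels += images.numel() // images.shape[1]
        batch_sum = images.sum(dim=(0, 2, 3))
        batch_sq_sum = (images ** 2).sum(dim=(0, 2, 3))
        channel_sum = batch_sum if channel_sum is None else channel_sum + batch_sum
        channel_sq_sum = (batch_sq_sum if channel_sq_sum is None
                          else channel_sq_sum + batch_sq_sum)
    mean = channel_sum / n_pixels
    # Var[x] = E[x^2] - E[x]^2, computed per channel.
    std = (channel_sq_sum / n_pixels - mean ** 2).sqrt()
    return mean, std


# Stand-in data: 64 random RGB "images" of size 8 x 8 with values in [0, 1].
torch.manual_seed(0)
fake_images = torch.rand(64, 3, 8, 8)
loader = DataLoader(TensorDataset(fake_images), batch_size=16)

mean, std = channel_mean_std(loader)
print(mean.tolist(), std.tolist())
```

The resulting values can then be passed as transforms.Normalize(mean=mean.tolist(), std=std.tolist()). Compute them on the training split only, with the same preprocessing (for example ToTensor()) that runs before Normalize.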