Normalization in the MNIST example

@lkins, @smth
Why did you guys say [-1, 1]? From the documentation, I only see [0, 1]:
http://pytorch.org/docs/master/torchvision/transforms.html

class torchvision.transforms.ToTensor

Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].
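
A quick way to check this (a minimal sketch, using a random uint8 image rather than a real dataset):

from PIL import Image
import numpy as np
from torchvision import transforms

img = Image.fromarray(np.random.randint(0, 256, (28, 28, 3), dtype=np.uint8))  # H x W x C, uint8 in [0, 255]
t = transforms.ToTensor()(img)
print(t.shape, t.min().item(), t.max().item())  # torch.Size([3, 28, 28]), values in [0.0, 1.0]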

So if I do the normalization on each channel myself, converting [a, b] to [0, 1], I don't need transforms.ToTensor anymore, right?

But what if each channel of my data has a different range, e.g. x: -10 to 10, y: 1 to 100, z: 20 to 25 (and the channels actually have some hidden correlation with each other)? How should I normalize it? It doesn't seem to make sense to normalize them all to the same range.
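
One option (a minimal sketch, assuming the three quantities are stacked as the channels of a (C, H, W) tensor and the per-channel minima/maxima above are known) is to min-max scale each channel to [0, 1] independently before any further standardization:

import torch

# hypothetical example: channel 0 in [-10, 10], channel 1 in [1, 100], channel 2 in [20, 25]
x = torch.stack([
    torch.empty(32, 32).uniform_(-10, 10),
    torch.empty(32, 32).uniform_(1, 100),
    torch.empty(32, 32).uniform_(20, 25),
])

lo = torch.tensor([-10.0, 1.0, 20.0]).view(-1, 1, 1)   # per-channel minimum (assumed known)
hi = torch.tensor([10.0, 100.0, 25.0]).view(-1, 1, 1)  # per-channel maximum (assumed known)

x01 = (x - lo) / (hi - lo)  # each channel is now independently scaled to [0, 1]
print(x01.amin(dim=(1, 2)), x01.amax(dim=(1, 2)))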

So can the ImageNet parameters
mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
also be used to normalize the CIFAR-10 dataset?

Can this normalization also be applied to a one-channel (grayscale) image?

I don't think so; for a grayscale image you could just use 0.5 for its mean and 0.5 for its std.
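
In code that would look something like this (a sketch; 0.5/0.5 is just a convention, and the dataset's actual statistics, e.g. 0.1307/0.3081 for MNIST, are usually the better choice):

from torchvision import transforms

gray_transform = transforms.Compose([
    transforms.ToTensor(),                 # uint8 [0, 255] -> float [0.0, 1.0], shape (1, H, W)
    transforms.Normalize((0.5,), (0.5,)),  # (x - 0.5) / 0.5, so the output is roughly in [-1, 1]
])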

Is it necessary to normalize the data? I’m just curious about two cases:

  1. If you don’t normalize the data
  2. If you don’t know the mean and std and just use 0.5 for all values.

Could you please add these explanations, perhaps as a footnote, to the tutorials? In their current form it is intimidating to see constants pop up without a proper explanation. Great work BTW.


@smth Why should they be in the [-1, 1] range? How does that help the network?

I get why the input has to be normalized, but if the values are between 0 and 1, isn't that already considered normalized? Why -1 and 1?

I guess that depends on the activation function(s) used. If you are using a sigmoid, you are better off with [0, 1] normalization; if you are using tanh (Tan-Sigmoid), then [-1, 1] normalization will do. The normalization can, in many cases, affect how long your network needs to converge, since the weights have to adapt to the scale of the inputs over time.
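
To make the mapping concrete, here is a sketch of the common pattern that turns ToTensor's [0, 1] output into [-1, 1]:

from torchvision import transforms

# Normalize computes (x - mean) / std per channel, so after ToTensor (values in [0, 1])
# mean=0.5, std=0.5 maps 0 -> -1 and 1 -> 1, i.e. the output lies in [-1, 1].
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # 3-channel (RGB) case
])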


To anybody looking for a more universal solution for custom datasets, this is what worked for me:

import numpy as np

# Note: training_set.data must be a numpy.ndarray in the range [0, 255]
# example data shape: (50000, 32, 32, 3) -- channels are the last dimension (e.g. CIFAR-10)
data = training_set.data
# compute the per-channel mean and std, then scale them into the range 0..1
mean = np.round(data.mean(axis=(0, 1, 2)) / 255, 4)
std = np.round(data.std(axis=(0, 1, 2)) / 255, 4)
print(f"mean: {mean}\nstd: {std}")

Thanks for the explanation

    train_transform = transforms.Compose([transforms.ToTensor()])
    train_set = torchvision.datasets.MNIST(root=data_dir, train=True, download=True, transform=train_transform)
    print("min:%f max:%f" %(train_set.data.min(), train_set.data.max())) #0,255

As we know, transforms.ToTensor() should put the values in [0, 1], so why is the maximum in the result above 255?

I am very confused. Can anyone help me? Thank you in advance.

You are directly indexing the internal .data attribute, which contains the raw, unprocessed samples.
If you want the transformations to be applied, you need to index or iterate the train_set itself, e.g. via train_set[0][0].min() (indexing returns an (image, target) tuple).
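
A quick way to see the difference (a sketch reusing the train_set defined above):

img, target = train_set[0]                 # __getitem__ applies the transform pipeline
print(train_set.data.max())                # tensor(255, dtype=torch.uint8) -- raw, untransformed
print(img.min().item(), img.max().item())  # roughly 0.0 and 1.0 after ToTensor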

Thank you so much for your explanation. It cleared up my confusion.

Is (0.1307,) the same as [0.1307] or [0.1307, 0.1307, 0.1307]? Thanks.