How to normalize a tensor to 0 mean and 1 variance?

Hi, I’m currently converting a tensor to a numpy array just so I can use sklearn.preprocessing.scale.
Is there a way to achieve this in PyTorch? I have seen there is torchvision.transforms.Normalize, but I can’t work out how to use it outside of the context of a dataloader. (I’m trying to use this on a tensor during training.)

Thanks in advance

1 Like

You could add the normalization in the __getitem__ function of your Dataset:

from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, X, y, transform=None):
        self.data = X
        self.target = y
        self.transform = transform

    def __getitem__(self, index):
        x = self.data[index]
        y = self.target[index]

        # Normalize your data here
        if self.transform:
            x = self.transform(x)

        return x, y

    def __len__(self):
        return len(self.data)

In this use case, you could set transform to something like this:

transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])
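Putting the two together, a minimal sketch (assuming X holds numpy or PIL images, which is what ToTensor expects) could look like this:

from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])
dataset = MyDataset(X, y, transform=transform)
loader = DataLoader(dataset, batch_size=32, shuffle=True)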
2 Likes

Would this do it?

import torch
from torchvision import transforms

mu = 2
std = 0.5
t = torch.tensor([1., 2., 3.])
(t - mu) / std
# or if t is an image
transforms.Normalize(mu, std)(t)

see:

https://pytorch.org/docs/master/torchvision/transforms.html#torchvision.transforms.Normalize

1 Like

Thanks, but this one won’t work for my use case, as I am not trying to do this when I load the data, but as part of another calculation that I am performing during training.
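
For reference, what I’m after is something like this rough sketch (standardizing an intermediate tensor on the fly; the tensor here is just a stand-in):

t = torch.randn(16, 128) * 3 + 5  # stand-in for an intermediate tensor from training
t_global = (t - t.mean()) / t.std()  # zero mean, unit variance over all elements
t_perdim = (t - t.mean(dim=0, keepdim=True)) / t.std(dim=0, keepdim=True)  # per feature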

1 Like

Yeah, I tried this and I always get an error:

for t, m, s in zip(tensor, self.mean, self.std):
TypeError: zip argument #2 must support iteration

Oh, the mean and std need to be arrays?

So you can’t zip self.mean and self.std if they are single values: zip takes multiple iterables and returns packaged tuples.

means = [self.mean] * tensor.size(0)
stds = [self.std] * tensor.size(0)
for t, m, s in zip(tensor, means, stds):
    # do stuff
    pass

This turns the means and stds into length-n lists, where n is the size of the first dimension of tensor.
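
So for the earlier snippet, something like this should avoid the zip error (a sketch: wrap the scalars in lists and give the tensor an image-like (C, H, W) shape):

t = torch.tensor([1., 2., 3.]).view(1, 1, 3)  # (C, H, W) with a single channel
norm = transforms.Normalize(mean=[2.0], std=[0.5])  # sequences, not scalars
out = norm(t)  # equivalent to (t - 2.0) / 0.5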

3 Likes

Great, I see. Thank you very much.

I haven’t figured out how to use transforms.Normalize on input data that is not an image. I get TypeError: tensor is not a torch image. Is there any way to use this method on non-images?

Normalize works on tensors, so the error message might come from another transformation:

norm = transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
x = torch.randn(3, 224, 224)
out = norm(x)
1 Like

That only works because your tensor has the dimensions of an image. If you look at the documentation, it says torchvision.transforms.Normalize is used to “normalize a tensor image with mean and standard deviation”. The argument is described as:

tensor (Tensor) – Tensor image of size (C, H, W) to be normalized.

My data is sequence data of dimension torch.Size([4, 589, 4])

2 Likes

Actually, you’re right: the error does go away if I get the dimensions right:

norm = transforms.Normalize((30, 30, 30, 30), (20, 25, 30, 35))
x = torch.randn(4, 589, 4)
out = norm(x)

But I don’t think this is applying the normalization correctly. The data from my data loader is shaped [batch_size, seq_length, x_dim], so the scaling should be applied to the last dimension, whereas I think Normalize applies the scaling across the first dimension (the image colour channels).
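
A manual sketch of what I mean, scaling over the last dimension instead (the statistics here are just placeholder values):

x = torch.randn(4, 589, 4)  # [batch_size, seq_length, x_dim]
mean = torch.tensor([30., 30., 30., 30.])  # one value per feature in x_dim
std = torch.tensor([20., 25., 30., 35.])
out = (x - mean) / std  # broadcasts over the last dimension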

Is it possible to extend/apply transforms.Normalize to normalize a multidimensional tensor in a custom PyTorch Dataset class? I have a tensor with shape (S x C x W x H) and I want to normalize over the C dimension.
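
i.e. something along these lines (a rough sketch of what I’d like to achieve, with placeholder statistics):

x = torch.randn(10, 3, 32, 32)  # (S, C, W, H)
mean = torch.tensor([0.5, 0.5, 0.5]).view(1, -1, 1, 1)  # shape (1, C, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(1, -1, 1, 1)
out = (x - mean) / std  # normalizes each channel separately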

Thanks! I have a question on how to set the mean and std for each channel: are they calculated from the dataset?

Yes, you can calculate the mean and std from your training dataset or use some “default” values, e.g. from ImageNet.
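
A rough sketch for computing per-channel statistics (assuming all samples can be stacked into one (N, C, H, W) tensor that fits in memory; dataset here is a placeholder for your training set):

data = torch.stack([img for img, _ in dataset])  # (N, C, H, W)
mean = data.mean(dim=(0, 2, 3))  # per-channel mean
std = data.std(dim=(0, 2, 3))    # per-channel std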

Is there a way to apply different transforms to the mask vs. the input? For example, I want to apply all deformation transforms to both, but I only want to apply Normalize and ToTensor to the predicted masks (not the target).

I would recommend using the functional API for these use cases, as it allows you to apply the same “random” transformation to the data and target, and it can also be used to call some transformations on one of these tensors separately.
Have a look at this example.
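
A rough sketch of the idea (assuming PIL inputs; the flip and the statistics are just examples):

import random
import torchvision.transforms.functional as TF

def paired_transform(image, mask):
    # apply the same random flip to both
    if random.random() > 0.5:
        image = TF.hflip(image)
        mask = TF.hflip(mask)
    # tensor conversion and normalization only on the image
    image = TF.to_tensor(image)
    image = TF.normalize(image, mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
    return image, mask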

1 Like

@FilipAndersson245 and I found out that the correct way to unnormalize is:

x * std + mean

We also had to clamp a few values outside of [0,1].

For a single image the code would look something like this:

def inv_normalize(img):
    # ImageNet statistics used during normalization
    mean = torch.tensor([0.485, 0.456, 0.406]).unsqueeze(-1)
    std = torch.tensor([0.229, 0.224, 0.225]).unsqueeze(-1)
    # flatten to (3, H*W) so the stats broadcast, then restore the shape
    img = (img.view(3, -1) * std + mean).view(img.shape)
    img = img.clamp(0, 1)
    return img

Feel free to help if the code can be written in a simpler way!
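
Edit: one possible simplification (a sketch, if I’m not mistaken) is to reuse transforms.Normalize with inverted parameters, since (x - m) / s with m = -mean/std and s = 1/std is exactly x * std + mean:

mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])
inv_normalize = transforms.Normalize(mean=(-mean / std).tolist(),
                                     std=(1 / std).tolist())
img = inv_normalize(img).clamp(0, 1)  # img: a normalized (3, H, W) image tensor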

1 Like

Hi @ptrblck, I am also trying to apply transforms.Normalize(mean, std) outside the data loader, somewhere in the training process. I am not sure how I would do this for a batch of images.

Also, I am using F.normalize(tensor, p=1, dim=1) inside my model. Now, if I am loading the data with transforms.Normalize(mean, std), does that mean I am applying the same normalization twice?

I saw the source for transforms.Normalize and it appears to be using F.normalize(tensor, self.mean, self.std, self.inplace), which I am not sure is the same thing or something different.

To apply transforms.Normalize on a batch you could either run this transformation in a loop on each input or normalize the data tensor manually via:

x = (x - mean) / std
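
E.g. a minimal sketch for a batch in (N, C, H, W) layout, reshaping the per-channel statistics so they broadcast:

mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
x = torch.randn(8, 3, 224, 224)  # stand-in for a batch of images
x = (x - mean) / std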

Inside transforms.Normalize, the torchvision.transforms.functional API will be used as F.normalize.
This is not the same method as torch.nn.functional.normalize and it accepts different input arguments.
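
To illustrate the difference (a quick sketch): torchvision’s normalize standardizes with mean/std, while torch.nn.functional.normalize rescales to unit Lp norm:

import torch.nn.functional as F
import torchvision.transforms.functional as TF

x = torch.randn(3, 8, 8)
a = TF.normalize(x, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])  # (x - mean) / std per channel
b = F.normalize(x, p=2, dim=0)  # divides by the L2 norm along dim 0 instead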

5 Likes