# Best practices with data processing and normalization --- images seem OK but is normalization OK?

Hi all,

I am hoping to confirm that what I did for data processing and visualizing images makes sense. I am doing a binary classification problem where images are (480,640,3)-sized depth images of blankets on a table-like surface.

I have the following two gists, which one should be able to run in the same directory if I have set things up correctly:

The first one loads the data (see the bottom post) into `ImageLoader`. It also computes the mean and standard deviation by putting all the pixel values (via `numbers.extend(d_img[:,:,0].flatten())`) into a `numbers` list and then taking the mean and standard deviation of that list. The mean turns out to be about 93 and the standard deviation about 84. The standard deviation is high because I have lots of 0s and lots of brighter values.

First question: is this a correct way of computing the per-channel mean? The depth images are replicated across all 3 channels, so the values are the same in each channel. I imagine there is a more efficient way to do this, though, perhaps by computing the standard deviation incrementally. Also, I see that in the ImageNet examples the mean values are within [0,1], so I am not sure whether my values should be scaled as well…
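For reference, here is a single-pass sketch I could use instead of building one giant `numbers` list: accumulate a running sum and sum of squares per channel. The function name and the iterable of HxWx3 uint8 arrays are placeholders, not from my gists:

```python
import numpy as np

def channel_stats(images):
    # Accumulate sum and sum of squares per channel instead of
    # storing every pixel value in a Python list.
    total = np.zeros(3, dtype=np.float64)
    total_sq = np.zeros(3, dtype=np.float64)
    count = 0
    for img in images:
        pix = img.reshape(-1, 3).astype(np.float64)
        total += pix.sum(axis=0)
        total_sq += (pix ** 2).sum(axis=0)
        count += pix.shape[0]
    mean = total / count
    # Var(X) = E[X^2] - E[X]^2, so std follows from the two sums.
    std = np.sqrt(total_sq / count - mean ** 2)
    return mean, std
```

This only keeps six accumulators in memory, regardless of how many images there are.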

Next, I went ahead to train the model (see second gist). I put this at the top:

```
MEAN = [93.8304761096, 93.8304761096, 93.8304761096]
STD = [84.9985507432, 84.9985507432, 84.9985507432]
```

because, again, the data is replicated across three channels.

Here’s the data transforms:

```
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(MEAN, STD)
    ]),
    'valid': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(MEAN, STD)
    ]),
}
```

I used a pre-trained ResNet-18 model. In the training loop I took the first minibatch and saved all the images, because I wanted to visualize the transforms. It took me a long time to figure out the correct way to get the images back to what I wanted: it’s in `_save_images` in the second gist. It saves into a directory, and I see depth images that make sense and have been cropped correctly, as you can see later in the second gist.

What is confusing is that I needed to do this snippet (see the gist for details):

```
img = img.transpose((1, 2, 0))
img = img * STD + MEAN
img = img * 255.0
img = img.astype(int)
```

That is: transpose to get the image into shape (224,224,3), then undo the STD and MEAN normalization, and then, which is really weird, multiply by 255. I assume this undoes the scaling that the `ToTensor()` transform does?

Second set of questions: does the data transformation I used above make sense (MEAN and STD computed on the domain data of interest), and can the `ToTensor()` scaling be undone by multiplying the image by 255? If so, I assume MEAN and STD need to be “adjusted” so that they reflect the rescaled image where pixels are in [0,1], rather than [0,255] as previously?
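To convince myself, I checked the round trip numerically on a few pixel values (a sketch, assuming the statistics have already been scaled into [0,1]; the values are rounded from my numbers above):

```python
import numpy as np

# Stats scaled from the [0, 255] range down to [0, 1].
mean, std = 93.83 / 255.0, 85.0 / 255.0
raw = np.array([0.0, 93.0, 255.0])           # raw pixel values

scaled = raw / 255.0                          # what ToTensor does
normed = (scaled - mean) / std                # what Normalize does
restored = (normed * std + mean) * 255.0      # undo Normalize, then undo ToTensor
```

Here `restored` matches `raw`, which is exactly what the `_save_images` snippet relies on.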

Sorry for the long message! I just wanted to make sure I was understanding PyTorch correctly. I’m happy to clarify anything.

The calculation of the mean and std on your images looks good.

There is a small issue in your transformation.
As you said, the mean and std for the ImageNet data are smaller than yours, because they were calculated on the normalized tensors.
`ToTensor` will transform your `PIL.Images` to normalized `tensors` in the range `[0, 1]`.
If you are using `Normalize` afterwards, you should make sure to use the mean and std calculated on these tensor images in the range `[0, 1]`. However, since you’ve already computed these values, you could just scale them with `1./255`.
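For example, something like this (a sketch using the values from your post):

```python
# Scale the statistics computed on [0, 255] pixels into the
# [0, 1] range that ToTensor produces.
MEAN = [93.8304761096 / 255.0] * 3
STD = [84.9985507432 / 255.0] * 3
```

These can then be passed to `transforms.Normalize(MEAN, STD)` exactly as before.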

The same applies to undo the normalization using the mean and std.
In your current code snippet you are assuming mean and std were calculated on the normalized tensors.


Thanks @ptrblck

I fixed the code a bit. The issue is that, while the saved images look fine, this was not the correct way to normalize the data. The way I had it earlier, with the mean and std taken from pixels in the range [0,255], the data gets transformed like this:

• `ToTensor` transforms images and scales into range [0,1]
• Then `Normalize` will do this: ([0,1] - rawmean) / rawstd

What we really want is the scaled mean and scaled std, as you pointed out (by scaled I mean computed on values in the range [0,1], not [0,255]). Of course, for undoing the normalization, it’s correct either way, as long as the same statistics are used in both directions:

(([0,1] - rawmean) / rawstd) * rawstd + rawmean = [0,1]
and then we multiply by 255.

or

(([0,1] - scaledmean) / scaledstd) * scaledstd + scaledmean = [0,1]
and then we multiply by 255.
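As a quick numeric sanity check of both versions (a sketch; the stats are rounded values from earlier in the thread):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 5)          # tensor values in [0, 1] after ToTensor
raw_mean, raw_std = 93.83, 85.0       # stats computed on [0, 255] pixels
sc_mean, sc_std = raw_mean / 255.0, raw_std / 255.0

for mean, std in [(raw_mean, raw_std), (sc_mean, sc_std)]:
    y = (x - mean) / std              # forward Normalize with these stats
    back = (y * std + mean) * 255.0   # undo Normalize, then multiply by 255
    assert np.allclose(back, x * 255.0)
```

Both recover the original pixel values, but only the scaled stats give a sensibly normalized input during training.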

I will simply use scaled mean and scaled std on my data from now on.
