Question about the Inception v3 model

In the Inception v3 model, the input is transformed like this:
[screenshot of the input transform code from torchvision's inception.py]
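For reference, the code in the screenshot looks roughly like this (my paraphrase of what torchvision's inception.py does when transform_input is set, not an exact copy):

import torch

def transform_input(x):
    # x: an (N, 3, H, W) batch normalized with the usual ImageNet mean/std
    x = x.clone()
    x[:, 0] = x[:, 0] * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
    x[:, 1] = x[:, 1] * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
    x[:, 2] = x[:, 2] * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
    return x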
I don't understand why PyTorch does this.

When we normalize an input, we usually transform it like this:
input[channel] = (input[channel] - mean[channel]) / std[channel]
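In torchvision this is what transforms.Normalize does, e.g. with the commonly used ImageNet statistics:

import torchvision.transforms as transforms

# per channel: img[c] = (img[c] - mean[c]) / std[c]
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])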

But what is the transformation PyTorch applies here supposed to mean? Is it normalization?

I'd appreciate it if anyone could share their thoughts.

Thanks very much.


I guess these numbers are the per-channel averages over all the ImageNet images.

It's doing a kind of reverse transform, or unnormalization, which maps the data to the range (-1, 1).

It's there to transform inputs normalized one way into inputs normalized a different way. The Inception v3 model uses the pre-trained weights from Google, which assume images are normalized as (img - 0.5) / 0.5, so that inputs lie between -1 and 1.
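In torchvision terms, that convention corresponds to something like the following (a sketch; google_normalize is just an illustrative name, not Google's actual preprocessing code):

import torchvision.transforms as transforms

# maps inputs from [0, 1] to [-1, 1]: (img - 0.5) / 0.5 per channel
google_normalize = transforms.Normalize(mean=[0.5, 0.5, 0.5],
                                        std=[0.5, 0.5, 0.5])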

The standard PyTorch/torchvision preprocessing transforms images using (roughly):

mean=[0.485, 0.456, 0.406]
std=[0.229, 0.224, 0.225]
img[c] = (img[c] - mean[c]) / std[c]

This gives the ImageNet training set approximately zero mean and unit variance per channel.

The code in the Inception model undoes the PyTorch normalization and then applies the Inception v3 one, fused into a single step.
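You can check the algebra yourself: undoing the mean/std normalization and then applying (x - 0.5) / 0.5 collapses into a single scale-and-shift per channel, which is exactly the form of the code in the screenshot. A quick numerical check (variable names are just illustrative):

import torch

mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

img = torch.rand(3, 8, 8)                      # a fake image in [0, 1]
x = (img - mean) / std                         # PyTorch-style normalization

undone = x * std + mean                        # back to [0, 1]
google = (undone - 0.5) / 0.5                  # Inception-style, in [-1, 1]

fused = x * (std / 0.5) + (mean - 0.5) / 0.5   # both steps fused into one
print(torch.allclose(google, fused))           # True (up to float error)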
