The PyTorch doc says:

“All pre-trained models expect input images normalized in the same way,” … “The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].”
Mathematically, standardizing data to a given mean and std should give the same result regardless of any prior affine scaling. Maybe I’m misunderstanding what is meant by “loaded in to a range”, but if it means an affine transform that maps the min to 0 and the max to 1, performing this operation prior to the normalization should have no effect?
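To illustrate the mathematical point being made here (assuming “normalized to a given mean and std” means per-sample standardization, i.e. subtracting the data’s own mean and dividing by its own std), a minimal sketch with NumPy — the data and the affine map are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(1000) * 50 + 10           # arbitrary raw values
y = (x - x.min()) / (x.max() - x.min())  # affine map of x into [0, 1]

# Standardizing each array with its own mean/std gives identical results,
# because an affine map y = a*x + b (a > 0) shifts the mean by the same
# offset and scales the std by the same factor, which then cancel out.
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()
assert np.allclose(zx, zy)
```

Note that this invariance only holds when the stats are computed from the data itself; `Normalize` in torchvision instead uses fixed ImageNet stats, which is what the answer below addresses.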
That’s correct, and you could skip the [0, 1] scaling and use the mean and std stats for the raw input values in [0, 255]. However, since ToTensor() is often used in the transformations to convert a PIL.Image in [0, 255] to a tensor in [0, 1], I believe this workflow grew organically.
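The equivalence described above can be sketched directly: since (x / 255 − m) / s = (x − 255·m) / (255·s), normalizing the raw [0, 255] values with rescaled stats matches the usual ToTensor-then-Normalize result. A minimal check with plain torch (the random image here stands in for real data):

```python
import torch

# ImageNet stats for inputs in [0, 1], shaped for per-channel broadcasting
mean01 = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std01 = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

# a stand-in image with raw values in [0, 255]
raw = torch.randint(0, 256, (3, 4, 4)).float()

# workflow A: scale to [0, 1] first (what ToTensor() does), then normalize
a = (raw / 255.0 - mean01) / std01

# workflow B: skip the scaling and rescale the stats instead
b = (raw - mean01 * 255.0) / (std01 * 255.0)

assert torch.allclose(a, b, atol=1e-6)
```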