How does transforms.ToTensor() work, and how are the mean and std values computed?


  1. I am currently using transforms.ToTensor(). According to the documentation, it converts data in the range 0-255 to 0-1.
    However, does the transform work on data whose values range from negative to positive? Any ideas how the transform handles such data? In that case the transformed values are no longer strictly positive.

  2. In most tutorials on fine-tuning pretrained models, the data is normalized with ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]). How are these values computed? Are they obtained by taking the mean and std of each channel over the entire training data?


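ToTensor() is meant for images whose elements are in the range 0 to 255 (PIL images or uint8 arrays). Other input, such as float arrays with negative values, is converted to a tensor without rescaling. You can write a custom transform to suit your needs.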
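To make the point above concrete, here is a minimal sketch in plain NumPy that imitates the documented ToTensor() behaviour instead of calling torchvision itself. The helper names `to_tensor_like` and `min_max_scale` are hypothetical, and the min-max rescaling is only one possible custom transform for signed data, not something the library provides under that name.

```python
import numpy as np

def to_tensor_like(img):
    """Hypothetical NumPy stand-in for the documented ToTensor() behaviour."""
    chw = np.transpose(img, (2, 0, 1))  # H x W x C -> C x H x W
    if img.dtype == np.uint8:
        return chw.astype(np.float32) / 255.0  # only uint8 input is rescaled
    return chw.copy()  # other dtypes pass through with their values unchanged

def min_max_scale(x):
    """Hypothetical custom transform: map any real-valued array onto [0, 1]."""
    lo, hi = x.min(), x.max()
    if hi == lo:
        return np.zeros_like(x)
    return (x - lo) / (hi - lo)

uint8_img = np.array([[[0, 128, 255]]], dtype=np.uint8)        # 1 x 1 x 3 image
signed_img = np.array([[[-2.0, 0.0, 6.0]]], dtype=np.float32)  # signed data

scaled = to_tensor_like(uint8_img)        # values now lie in [0, 1]
passthrough = to_tensor_like(signed_img)  # still contains -2.0
rescaled = min_max_scale(passthrough)     # custom step brings it into [0, 1]
```

With signed float input the stand-in leaves the negative values in place, which matches the observation in the question. An explicit rescaling step such as `min_max_scale` is needed if the range [0, 1] is required.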

[0.485, 0.456, 0.406] is the per-channel mean of ImageNet after the pixel values are scaled to [0, 1], and [0.229, 0.224, 0.225] is the corresponding per-channel std.

Yes, they are computed per channel.
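For your own dataset, the same kind of statistics could be computed roughly as follows. This is a sketch on random synthetic data, not ImageNet, and it assumes the images fit in memory as one N x H x W x C uint8 array. The last line applies the same (x - mean) / std formula that transforms.Normalize uses per channel.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for a training set: N x H x W x C, uint8 (not ImageNet).
images = rng.integers(0, 256, size=(16, 8, 8, 3), dtype=np.uint8)

# Scale to [0, 1] first, since the statistics are applied after ToTensor().
scaled = images.astype(np.float64) / 255.0

# Reduce over images, rows and columns; keep one value per channel.
mean = scaled.mean(axis=(0, 1, 2))
std = scaled.std(axis=(0, 1, 2))

# Channel-wise normalisation: (x - mean) / std, broadcast over the last axis.
normalized = (scaled - mean) / std
```

After this step each channel of `normalized` has approximately zero mean and unit std over the whole set.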


OK, I got it. That explains it. The entire array is converted to a torch tensor and then divided by 255. This is how the input to the network is forced to lie between 0 and 1.


Actually, I found that it differs from *1.0/255 in C++. The results of transforms.ToTensor() in Python and of *1.0/255 in C++ are not the same.

I am wondering why…
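The thread does not show the C++ code, so the cause of the mismatch is unknown. One possibility worth ruling out (an assumption, not a diagnosis) is memory layout: ToTensor() also moves the channel axis to the front, so an element-by-element comparison against an interleaved H x W x C buffer fails even when the scaling is identical. A NumPy sketch of that effect:

```python
import numpy as np

# Tiny H x W x C image with distinct values; RGB channel order is assumed.
hwc = np.arange(2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 3)

# Scaling only, as a plain "* 1.0 / 255" loop over an interleaved buffer would do.
scaled_hwc = hwc.astype(np.float32) / 255.0

# ToTensor-style result: same scaling, but channels moved to the front (C x H x W).
scaled_chw = np.transpose(scaled_hwc, (2, 0, 1))

# Same set of values, but a different element order when flattened.
same_values = np.array_equal(np.sort(scaled_hwc.ravel()), np.sort(scaled_chw.ravel()))
same_flat_order = np.array_equal(scaled_hwc.ravel(), scaled_chw.ravel())

# If the C++ side loads images in BGR order (an assumption, e.g. with OpenCV),
# the first and last channels are swapped as well.
bgr = hwc[..., ::-1]
```

Here `same_values` is true while `same_flat_order` is false, so comparing the two buffers index by index would report a difference that has nothing to do with the divisor.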

My gut instinct was * 1.0 / 256 instead of * 1.0 / 255. Does this fix the problem, @LvJC?
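A quick arithmetic check of the two candidate divisors, for reference. The ToTensor() documentation states an output range of [0.0, 1.0], which is what division by 255 yields; the snippet only illustrates the difference and does not call torchvision.

```python
max_pixel = 255  # largest value an 8-bit pixel can take

by_255 = max_pixel / 255  # reaches exactly 1.0
by_256 = max_pixel / 256  # stops just short of 1.0

print(by_255, by_256)
```

Dividing by 256 would therefore never produce 1.0 for 8-bit input.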
