Hi all,
I was wondering: when using the pretrained networks from the torchvision.models module, what preprocessing should be applied to the input images?
For instance, I remember that if you use the 19-layer VGG you should subtract the following means: [103.939, 116.779, 123.68].
Where can I find these numbers (and, even better, the std values) for alexnet, resnet and squeezenet?

All pretrained torchvision models use the same preprocessing, which is to normalize each channel with mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225] (the input is in RGB format).

Hi, it looks like the pixel intensities have been rescaled to [0, 1] before normalization. Is that right?


@qianguih yes, the input has to be RGB with values scaled to [0, 1] before applying the normalization that I pointed out.
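
For anyone landing here later, this is a minimal NumPy sketch of the two steps in the order described: first rescale to [0, 1], then apply the per-channel mean/std normalization. The mean/std values are the ones commonly used with torchvision's ImageNet-pretrained models; the helper name `preprocess` and the dummy gray image are made up for illustration. In a real pipeline, `transforms.ToTensor()` followed by `transforms.Normalize(mean, std)` should cover the same two steps.

```python
import numpy as np

# Per-channel ImageNet statistics (RGB order) used with torchvision's
# pretrained models. They apply to inputs already scaled to [0, 1].
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(image_uint8):
    """Hypothetical helper: HxWx3 uint8 RGB image -> 3xHxW float32 array."""
    if image_uint8.ndim != 3 or image_uint8.shape[2] != 3:
        raise ValueError("expected an HxWx3 RGB image")
    # Step 1: rescale pixel intensities from [0, 255] to [0, 1].
    scaled = image_uint8.astype(np.float32) / 255.0
    # Step 2: subtract the mean and divide by the std, channel by channel.
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD
    # Move channels first (CxHxW), the layout PyTorch models expect.
    return np.transpose(normalized, (2, 0, 1))


# A mid-gray dummy image stands in for real data (assumption for the demo).
dummy = np.full((224, 224, 3), 128, dtype=np.uint8)
out = preprocess(dummy)
print(out.shape, out.dtype)
```

Skipping step 1 and normalizing raw 0-255 values with these statistics would produce inputs hundreds of times larger than what the networks saw during training, which is why the order matters.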

This is important information. I wonder why it is only in the example code and not in the docs?

Agreed. If it weren't for this thread, I would have missed this important normalization step for sure. It would be nice if it could be added to the documentation.

This is pretty key information. Without doing this, and only doing mean centering and stddev normalization of the original Hounsfield units, I need to keep batch normalization enabled at test time to see reasonable results from my volumetric segmentation network.

This should really be in bold somewhere.