How to preprocess input for pretrained networks?


(Tristan Stérin) #1

Hi all,
I was wondering: when using the pretrained networks of the torchvision.models module, what preprocessing should be done on the input images we feed them?
For instance, I remember that if you use VGG 19 you should subtract the following means: [103.939, 116.779, 123.68].
Where can I find these numbers (and ideally the std values as well) for alexnet, resnet and squeezenet?

Thank you very much


#2

All pretrained torchvision models use the same preprocessing: normalize with the mean/std values shown here: https://github.com/pytorch/examples/blob/master/imagenet/main.py#L92-L93 (input is in RGB format).


Worse results with Pytorch data preprocessing?
How the means and stds get calculated in the Transfer Learning tutorial
(Tristan Stérin) #3

Thank you very much!


(qianguih) #4

Hi, it looks like the pixel intensities have been rescaled to [0, 1] before normalization. Is that right?


#5

@qianguih yes, they have to be RGB images rescaled to [0, 1] before applying the normalization I pointed out.
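To make the two steps explicit, here is a small plain-Python illustration of what happens to a single pixel (the `preprocess_pixel` helper is hypothetical, just for demonstration; in practice `ToTensor` and `Normalize` do this for whole tensors):

```python
# ImageNet per-channel statistics from the linked example (RGB order).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def preprocess_pixel(rgb):
    """Normalize one (R, G, B) pixel of 8-bit values."""
    scaled = [c / 255.0 for c in rgb]        # step 1: rescale [0, 255] -> [0, 1]
    return [(c - m) / s                      # step 2: (x - mean) / std per channel
            for c, m, s in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)]

# A pixel close to the dataset mean ends up near zero in every channel:
print(preprocess_pixel((124, 116, 104)))
```

Skipping step 1 and normalizing raw [0, 255] values with these statistics gives wildly out-of-range inputs, which is why the rescale matters.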


(qianguih) #6

I see. Thank you very much!


(Ecolss) #7

This is important information. I wonder why it's only in the example code and not in the docs?


(Mehdi Shibahara) #8

Agreed. If it wasn’t for this thread, I would have missed this important Normalization step for sure. It would be nice if it could be added to the documentation.


(Matthew Macy) #9

This is pretty key information. Without it, doing only mean centering and stddev normalization of the original Hounsfield units, I need to keep batch normalization enabled at test time to see reasonable results from my volumetric segmentation network.

This should really be in bold somewhere.