How to preprocess input for pretrained networks?

tsterin · February 23, 2017, 9:06am

Hi all,
I was wondering: when using the pretrained networks in the torchvision.models module, what preprocessing should be applied to the input images?
For instance, I remember that for the 19-layer VGG you should subtract the following means: [103.939, 116.779, 123.68].
Where can I find these numbers (ideally with std values too) for alexnet, resnet and squeezenet?

Thank you very much

smth · February 23, 2017, 8:09pm

All pretrained torchvision models have the same preprocessing, which is to normalize using the following mean/std values: https://github.com/pytorch/examples/blob/97304e232807082c2e7b54c597615dc0ad8f6173/imagenet/main.py#L197-L198 (input is RGB format)

tsterin · February 23, 2017, 9:24pm

Thank you very much!

qianguih · March 7, 2017, 9:13am

Hi, it looks like the pixel intensities have been rescaled to [0, 1] before normalization. Is that right?

smth · March 7, 2017, 4:33pm

@qianguih yes, the images have to be in RGB format and rescaled to [0, 1] before applying the normalization that I pointed out.
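
To make the two steps concrete, here is a minimal pure-Python sketch of the arithmetic for a single RGB pixel: rescale the 8-bit values to [0, 1], then subtract the per-channel mean and divide by the per-channel std. The mean/std values are the ones quoted in this thread; the helper name `preprocess_pixel` is made up for illustration. In torchvision, `transforms.ToTensor()` followed by `transforms.Normalize(mean, std)` applies the same two steps to whole images.

```python
# Per-channel ImageNet statistics quoted in this thread (RGB order, [0, 1] scale).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def preprocess_pixel(rgb_uint8):
    """Rescale an 8-bit RGB pixel to [0, 1], then normalize each channel.

    `preprocess_pixel` is a hypothetical helper, for illustration only.
    """
    # Step 1: map [0, 255] to [0, 1].
    scaled = [v / 255.0 for v in rgb_uint8]
    # Step 2: (x - mean) / std, channel by channel.
    return [
        (x - m) / s
        for x, m, s in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)
    ]


black = preprocess_pixel((0, 0, 0))
white = preprocess_pixel((255, 255, 255))
print(black)
print(white)
```

Skipping step 1 and feeding raw 0-255 values into step 2 would produce inputs far larger than anything the pretrained weights saw during training.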

qianguih · March 7, 2017, 6:22pm

I see. Thank you very much!

ecolss · March 13, 2017, 1:46am

This is important information. I wonder why it is only in the example code and not in the docs?

mehdi-shiba · March 16, 2017, 6:03am

Agreed. If it wasn’t for this thread, I would have missed this important Normalization step for sure. It would be nice if it could be added to the documentation.

mattmacy · March 16, 2017, 6:39am

This is pretty key information. Without doing this, and only doing mean centering and stddev normalization of the original Hounsfield units, I need to keep batch normalization enabled during test to see reasonable results from my volumetric segmentation network.

This should really be in bold somewhere.

youkaichao · July 27, 2018, 10:23am

Can we put the mean and std inside the ResNet model? We would just need to register a buffer.

Deeply · January 27, 2019, 12:39pm

Of course not.
Obviously, you have ‘division by zero’ somewhere. Try to debug the code to figure out the source of error.

Deeply · January 28, 2019, 9:23am

Shouldn’t the mean and std be tuples?

normalize = torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))

Or is it OK to use lists, as in your case?

Deeply · January 30, 2019, 1:00pm

Maybe you can remove this normalization to see if you are still having these inf values.
If yes, this means the source of error is somewhere else than Normalize.
If not, could you try a dummy normalization with
normalize = torchvision.transforms.Normalize(mean=(1, 1, 1), std=(1, 1, 1))
?

e3312f50f3ba76f35a60 · April 8, 2019, 5:09am

If I use the pretrained model on ImageNet and fine-tune it on my own dataset, should I re-calculate the mean and std with my own dataset?

jdhao · May 14, 2019, 7:36am

Using the ImageNet mean and std is pretty standard practice. Since they are calculated over roughly a million images, the statistics are pretty stable. Also, the pretrained model was trained with the ImageNet mean and std. I do not recommend replacing them with the mean and std of your own small dataset.
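
If you want to see how far your own data is from the ImageNet statistics before deciding, a rough sketch like the following computes the per-channel mean and std over a set of images. It assumes each image is a nested list of RGB pixels already rescaled to [0, 1]; `channel_stats` and the toy data are hypothetical and only there to show the calculation.

```python
import math


def channel_stats(images):
    """Return per-channel (means, stds) over all pixels of all images.

    Each image is assumed to be a list of rows, each row a list of
    (r, g, b) tuples with values already rescaled to [0, 1].
    """
    sums = [0.0, 0.0, 0.0]
    sq_sums = [0.0, 0.0, 0.0]
    count = 0
    for image in images:
        for row in image:
            for pixel in row:
                for c in range(3):
                    sums[c] += pixel[c]
                    sq_sums[c] += pixel[c] ** 2
                count += 1
    means = [s / count for s in sums]
    # Population std via E[x^2] - E[x]^2; clamp tiny negatives from rounding.
    stds = [
        math.sqrt(max(q / count - m * m, 0.0))
        for q, m in zip(sq_sums, means)
    ]
    return means, stds


# Toy dataset: two 1x2 images (made-up values, for illustration only).
toy_images = [
    [[(0.2, 0.4, 0.6), (0.4, 0.4, 0.2)]],
    [[(0.6, 0.4, 0.6), (0.8, 0.4, 0.2)]],
]
means, stds = channel_stats(toy_images)
print(means, stds)
```

The resulting values can then be compared against mean=[0.485, 0.456, 0.406] and std=[0.229, 0.224, 0.225] from the post above.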

r0mer0m · May 28, 2019, 6:26pm

Thank you for your comment. I had the same doubt a while ago. So what I infer from the comments, as a summary, is that the best practice is to weigh:

Stability of our images (mean/std)
Similarity of our image dataset to ImageNet

for our specific task and dataset, to decide which normalisation works best.

Nikronic · June 8, 2019, 12:06pm

Actually, this link points to the correct lines:

https://github.com/pytorch/examples/blob/master/imagenet/main.py#L197-L198

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

MasayoMusic · June 9, 2019, 5:05pm

Is this applicable to all ImageNet models that come with PyTorch?
I remember that in Keras each model had its own specific preprocessing function.

Nikronic · June 9, 2019, 5:25pm

Yes, this step is used for all the ImageNet models.
Here is the link to the README:

https://github.com/pytorch/examples/blob/master/imagenet/README.md#imagenet-training-in-pytorch

# ImageNet training in PyTorch

This implements training of popular model architectures, such as ResNet, AlexNet, and VGG on the ImageNet dataset.

## Requirements

- Install PyTorch ([pytorch.org](http://pytorch.org))
- `pip install -r requirements.txt`
- Download the ImageNet dataset and move validation images to labeled subfolders
    - To do this, you can use the following script: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh

## Training

To train a model, run `main.py` with the desired model architecture and the path to the ImageNet dataset:

```bash
python main.py -a resnet18 [imagenet-folder with train and val folders]
```

The default learning rate schedule starts at 0.1 and decays by a factor of 10 every 30 epochs. This is appropriate for ResNet and models with batch normalization, but too high for AlexNet and VGG. Use 0.01 as the initial learning rate for AlexNet or VGG:

All of the mentioned models use this normalization.

pp18 · June 12, 2019, 11:59am

@smth I am confused about using the same mean and std for all data. Why can’t we normalize per image, so that each image is normalized independently?
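
To illustrate what the two options compute (without claiming which is better for a given task), here is a small sketch on single-channel toy data. With fixed statistics shared by all images, a darker copy of an image stays darker after normalization; with per-image statistics, both copies map to the same values, so overall brightness is discarded. All function names and numbers are made up for illustration, apart from 0.485 and 0.229, which are the red-channel values quoted earlier in the thread.

```python
import math


def normalize_fixed(pixels, mean, std):
    """Normalize with statistics shared by every image."""
    return [(p - mean) / std for p in pixels]


def normalize_per_image(pixels):
    """Normalize with the image's own mean and std (assumes std > 0)."""
    mean = sum(pixels) / len(pixels)
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))
    return [(p - mean) / std for p in pixels]


bright = [0.5, 0.7, 0.9]
dark = [p - 0.4 for p in bright]  # same pattern, lower brightness

fixed_bright = normalize_fixed(bright, mean=0.485, std=0.229)
fixed_dark = normalize_fixed(dark, mean=0.485, std=0.229)
own_bright = normalize_per_image(bright)
own_dark = normalize_per_image(dark)

print(fixed_bright, fixed_dark)  # brightness difference is preserved
print(own_bright, own_dark)      # brightness difference is removed
```

A model pretrained with fixed ImageNet statistics saw inputs of the first kind, which is one reason the thread recommends keeping them when reusing those weights.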