Normalizing with ImageNet mean and std vs. normalizing with my own dataset's mean and std

I am using a network pre-trained on ImageNet data. The mean and std of ImageNet are:
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
whereas I have 1000 images in my own dataset.
The mean and std of my dataset are:
mean = [0.5589, 0.3216, 0.2356]
std = [0.3056, 0.21442, 0.1775]
I am confused whether I should use the ImageNet mean and std or my own dataset's mean and std to normalize my images.
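
For reference, per-channel stats like these can be computed along the following lines; this is a minimal sketch, where `my_dataset` is a placeholder for your own `Dataset` yielding `[C, H, W]` tensors scaled to `[0, 1]` (e.g. via `transforms.ToTensor()`):

```python
import torch
from torch.utils.data import DataLoader

# `my_dataset` is a placeholder for your own Dataset; it should yield
# (image, target) pairs where image is a [C, H, W] tensor in [0, 1].
loader = DataLoader(my_dataset, batch_size=64, num_workers=2)

n_pixels = 0
channel_sum = torch.zeros(3)
channel_sq_sum = torch.zeros(3)
for images, _ in loader:
    b, c, h, w = images.shape
    n_pixels += b * h * w
    channel_sum += images.sum(dim=[0, 2, 3])
    channel_sq_sum += (images ** 2).sum(dim=[0, 2, 3])

# std via E[X^2] - E[X]^2 over all pixels per channel
mean = channel_sum / n_pixels
std = (channel_sq_sum / n_pixels - mean ** 2).sqrt()
print("mean:", mean, "std:", std)
```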

Is there any way to find out whether fine-tuning or feature extraction (i.e. updating only the last layer) is better in this case?

You could try both approaches and check the training as well as validation losses.
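
E.g. you could keep two normalization transforms and just swap them between runs; a minimal torchvision sketch (the resize/crop sizes are arbitrary placeholders):

```python
import torchvision.transforms as T

imagenet_norm = T.Normalize(mean=[0.485, 0.456, 0.406],
                            std=[0.229, 0.224, 0.225])
custom_norm = T.Normalize(mean=[0.5589, 0.3216, 0.2356],
                          std=[0.3056, 0.21442, 0.1775])

# Swap the last transform between the two runs and compare the loss curves.
transform = T.Compose([
    T.Resize(256),        # placeholder sizes
    T.CenterCrop(224),
    T.ToTensor(),
    imagenet_norm,        # or custom_norm in the second run
])
```
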
Let us know which stats worked better for your use case. :wink:

Isn't this a strange answer? Or is this how deep learning works? :sweat_smile:

I would claim that’s unfortunately how the majority of things work, and the answer is often: “it depends”.
E.g. it’s commonly believed that fine-tuning only the last layer while freezing the feature extractor works well if the data comes from a similar domain (e.g. ImageNet was used for pretraining and your current dataset also contains “natural” images). Training more layers (including the feature extractor) could be helpful if the data domain changes (e.g. if you are now using medical CT scans). While this makes sense, I wouldn’t know when the data domain changes significantly enough to switch to a full model training.
E.g. would another JPEG encoding already force you to fine-tune the full model? What happens if your “natural” images were transformed via a filter and their colormap changed?
So I would claim you would still have to run experiments, as “it depends” :wink:
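
To make the two setups concrete, here is a minimal sketch using a torchvision ResNet (`resnet50` and `num_classes = 10` are arbitrary placeholders for your model and task):

```python
import torch.nn as nn
import torchvision.models as models

num_classes = 10  # placeholder for your dataset

# Option 1: feature extraction - freeze the pretrained backbone
# and train only the newly initialized classifier head.
model = models.resnet50(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new layer is trainable by default

# Option 2: full fine-tuning - keep all parameters trainable
# (often with a lower learning rate for the pretrained layers).
model_ft = models.resnet50(pretrained=True)
model_ft.fc = nn.Linear(model_ft.fc.in_features, num_classes)
```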