I am confused about subtracting the mean from every image.
- If I want to finetune a network (pretrained on ImageNet) on my own dataset, should I subtract the ImageNet mean or my dataset's mean? I think that if I finetune on a new dataset, I should subtract the new dataset's mean, and if I only want to test on the new dataset, I should subtract the ImageNet mean. Am I right?
- When calculating the dataset mean and std, should I use only the training set, or both the training and test sets?
- Should I calculate the mean and std on the original images, or on the images resized for training?
- How do I calculate the image mean and std when the dataset is very large? The mean is easier, since I just need to accumulate over the tensors. But how do I calculate the std? I currently concatenate every tensor (after viewing it as shape (3, -1)), but that is not feasible for a large dataset because it takes too much memory.
- In the PyTorch implementation, the preprocessing is `normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])`, but in the Caffe implementation the mean is `[123, 117, 104]`. I guess this is because image values in PyTorch are in `[0, 1]` and in Caffe they are in `[0, 255]`. Am I right? But where does `std = [0.229, 0.224, 0.225]` come from? It doesn’t appear in the Caffe implementation. Is the std necessary? Why does the Caffe implementation normalize the data using only the mean and not the std?
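To make the scale guess in the last point concrete, here is a small arithmetic check. It only multiplies the PyTorch constants quoted above by 255; the variable names are my own, and the comparison with the Caffe values is approximate, not a statement about how either set of numbers was originally derived.

```python
# Constants quoted above from transforms.Normalize (values on a [0, 1] scale).
pytorch_mean = [0.485, 0.456, 0.406]
pytorch_std = [0.229, 0.224, 0.225]

# Rescale to the [0, 255] range that the Caffe mean appears to use.
mean_255 = [round(m * 255, 3) for m in pytorch_mean]
std_255 = [round(s * 255, 3) for s in pytorch_std]

print("mean on [0, 255] scale:", mean_255)
print("std on [0, 255] scale:", std_255)
```

The rescaled mean comes out close to, though not exactly equal to, `[123, 117, 104]`, which is at least consistent with the two frameworks using different value ranges.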
Could anyone help figure it out? Thanks!
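For the large-dataset question, one approach that avoids concatenating everything is to keep running per-channel sums and sums of squares, then use Var[x] = E[x^2] - (E[x])^2. Below is a minimal sketch, assuming each batch arrives as a float array of shape (N, 3, H, W); the function name `channel_mean_std` is hypothetical, and it uses NumPy only to keep the example self-contained (a PyTorch version should work the same way with `tensor.sum(dim=(0, 2, 3))`).

```python
import numpy as np

def channel_mean_std(batches):
    """Per-channel mean and (population) std over an iterable of
    (N, 3, H, W) arrays, without holding the whole dataset in memory."""
    count = 0                                   # pixels seen per channel
    total = np.zeros(3, dtype=np.float64)       # running sum of x
    total_sq = np.zeros(3, dtype=np.float64)    # running sum of x^2
    for batch in batches:
        x = np.asarray(batch, dtype=np.float64)  # float64 limits rounding error
        n, _, h, w = x.shape
        count += n * h * w
        total += x.sum(axis=(0, 2, 3))
        total_sq += (x ** 2).sum(axis=(0, 2, 3))
    mean = total / count
    var = total_sq / count - mean ** 2
    std = np.sqrt(np.maximum(var, 0.0))         # clamp tiny negative values
    return mean, std
```

Because only three running totals per statistic are stored, memory use does not grow with the dataset, and batches may have different spatial sizes since pixels are counted per batch. Passing a generator or a data loader that yields such batches should work; with very large pixel counts, a Welford-style update would be the numerically safer alternative.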
HELP!