"All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224. The images have to be loaded in to a range of [0, 1] and then normalized using
mean = [0.485, 0.456, 0.406] and
std = [0.229, 0.224, 0.225]"
What does "expect input images" mean in this description?
I’m doing style transfer. What I’m actually doing is normalizing the image generated by my model with this mean and std, normalizing the target image the same way, and then extracting features from VGG for both.
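To make that concrete, here is a minimal sketch of what I’m currently doing. The small `features` network is only a stand-in for the frozen VGG layers, and the random tensors stand in for my generated and target images; all of these names are made up for illustration.

```python
import torch
import torch.nn as nn

# ImageNet statistics from the quote above, shaped to broadcast over (N, 3, H, W).
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def normalize(img):
    """Normalize an RGB batch in [0, 1] per channel with MEAN and STD."""
    return (img - MEAN) / STD

# Hypothetical stand-in for the frozen VGG feature layers (randomly initialised).
features = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU()).eval()
for p in features.parameters():
    p.requires_grad_(False)

# Placeholders: the output of my model and the target image, both in [0, 1].
generated = torch.rand(1, 3, 224, 224, requires_grad=True)
target = torch.rand(1, 3, 224, 224)

# Both images go through the same normalization before feature extraction.
loss = nn.functional.mse_loss(
    features(normalize(generated)),
    features(normalize(target)),
)
loss.backward()  # the normalization is differentiable, so gradients reach `generated`
```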
What I understand from this sentence, and from some sources I found on the web, is that I should normalize only the target image, and not my generated image, before extracting the features and comparing them.