The VGG paper describes "multi-scale training", a procedure in which each training image is rescaled so its shorter side equals a scale S sampled from a range (256 to 512 in the paper), and a fixed-size 224×224 crop is then taken for training the model.
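For reference, that procedure can be sketched roughly as follows. This is just a dependency-free illustration using NumPy with a nearest-neighbour resize as a stand-in for proper bilinear interpolation, not the actual training code; the function name and defaults are my own.

```python
import numpy as np

def multiscale_crop(img, s_min=256, s_max=512, crop=224, rng=None):
    """Sketch of VGG-style multi-scale training preprocessing:
    rescale so the shorter image side equals a random S in
    [s_min, s_max], then take a random crop x crop patch.
    Nearest-neighbour resize keeps this example dependency-free;
    real pipelines use bilinear interpolation."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = img.shape[:2]
    s = int(rng.integers(s_min, s_max + 1))  # training scale S
    scale = s / min(h, w)
    nh, nw = round(h * scale), round(w * scale)
    # Nearest-neighbour index maps for the resize.
    ys = (np.arange(nh) * h / nh).astype(int)
    xs = (np.arange(nw) * w / nw).astype(int)
    resized = img[ys][:, xs]
    # Random crop; fits because min(nh, nw) = s >= crop.
    y0 = int(rng.integers(0, nh - crop + 1))
    x0 = int(rng.integers(0, nw - crop + 1))
    return resized[y0:y0 + crop, x0:x0 + crop]
```

With torchvision one would typically approximate this with resize and random-crop transforms instead.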
Were the VGG models in torchvision.models trained with the same multi-scale procedure?
Thanks.