Image size for training in VGG

Hi, I have a dataset with images of size 720x1280. I want to train on these images using VGG16 via transfer learning. Do I need to resize the images to 224x224 first so they fit the VGG16 input dimensions, or is that unnecessary?

Thank you

Larger images should work in this model, as torchvision models use an adaptive pooling layer before feeding the activations to the first linear layer.
However, since the model was pretrained on smaller images (224x224), you might want to fine-tune the conv layers so that they can extract features at the new resolution, but that depends on your use case, of course.
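A minimal sketch to check this, assuming a recent torchvision: the VGG16 implementation ends its feature extractor with `nn.AdaptiveAvgPool2d((7, 7))`, so a 720x1280 batch passes through without any resizing.

```python
import torch
from torchvision import models

# torchvision's VGG16 pools to a fixed 7x7 map before the classifier,
# so inputs larger than 224x224 are accepted as-is.
model = models.vgg16(pretrained=True)  # the weights argument differs across torchvision versions
x = torch.randn(1, 3, 720, 1280)       # dummy batch at the original resolution
out = model(x)
print(out.shape)  # torch.Size([1, 1000])
```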


As mentioned above, VGG16 was trained on much smaller images, so it might not work as well on your data. To work around this, you can add a conv layer before feeding into VGG16, so that the feature maps after your conv layer are on the same scale the pretrained model expects.
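One way to do that is a learnable, strided downsampling conv in front of the pretrained backbone. This is only a sketch; the kernel size, strides, and 10-class head are illustrative assumptions, not part of the original suggestion.

```python
import torch
import torch.nn as nn
from torchvision import models

class DownsampleVGG(nn.Module):
    def __init__(self, num_classes=10):  # num_classes is a placeholder
        super().__init__()
        # Strided conv brings 720x1280 down to roughly the 224x224 scale
        # the backbone was pretrained on (here: 720/3=240, 1280/5=256).
        self.downsample = nn.Conv2d(3, 3, kernel_size=7, stride=(3, 5), padding=3)
        self.backbone = models.vgg16(pretrained=True)
        # Replace the last classifier layer for the new task.
        self.backbone.classifier[6] = nn.Linear(4096, num_classes)

    def forward(self, x):
        x = self.downsample(x)
        return self.backbone(x)

model = DownsampleVGG()
out = model(torch.randn(1, 3, 720, 1280))
print(out.shape)  # torch.Size([1, 10])
```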


Thank you for your reply. Is it fine if I resize the images first, for example to 180x320, before feeding them into VGG16?

You can, but you may need to adapt the size of the linear layers.
The issue mentioned before is more conceptual than technical: you have large images and the model was trained on a different dataset, so it may fail to perform well on your images.
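If you go the resizing route, a hedged sketch of the preprocessing pipeline could look like the following; the 180x320 target and ImageNet normalization stats are assumptions based on the discussion above.

```python
from torchvision import transforms

# Resize 720x1280 frames to 180x320 (a 4x downscale that preserves the
# aspect ratio) and normalize with the ImageNet statistics VGG16 was
# pretrained with.
preprocess = transforms.Compose([
    transforms.Resize((180, 320)),  # (height, width)
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```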

Thank you. So what you suggest is that I add a conv layer first, which converts my images from 720x1280 to 224x224, and then use VGG16 after that, correct?

You can,
or you can do as you suggested:
subsample your dataset and feed it to VGG16.
