Pretrained Model for 128X128 RGB images

rmokady · March 21, 2019, 12:07pm

Hi All,
I want to use a pretrained model (pretrained on ImageNet) for feature extraction. I prefer GoogLeNet, but I think ResNet, VGG, or similar will do.
The problem is that I want to use 128x128 RGB images, and I noticed that the models in torchvision.models were pretrained on larger images.

Are you familiar with such a pretrained model (128x128)?

Sunshine352 · March 21, 2019, 12:31pm

Two methods are available:
One is to resize your input images to 224x224 and then use ResNet, VGG, etc. directly.
The other is to revise the network structure of ResNet, VGG, etc. so that the variant accepts 128x128 RGB images, load part of the weights from the pretrained ResNet or VGG model, and finally train/fine-tune the revised network on your own dataset.
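
As a rough illustration of what "loading part of the weights" could look like, here is a minimal sketch. The two small networks are hypothetical stand-ins (not torchvision models): `pretrained` plays the role of the ImageNet model and `revised` is the modified network with a different head. Only entries whose name and shape match are copied; everything else keeps its fresh initialization.

```python
import torch
from torch import nn


def make_net(num_classes):
    # Hypothetical stand-in for a real backbone, used only for illustration.
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(8, num_classes),
    )


pretrained = make_net(num_classes=1000)  # plays the role of the ImageNet model
revised = make_net(num_classes=10)       # revised network with a new head

# Keep only the entries whose name and shape match the revised network.
target_state = revised.state_dict()
compatible = {
    name: tensor
    for name, tensor in pretrained.state_dict().items()
    if name in target_state and tensor.shape == target_state[name].shape
}

# strict=False tolerates the keys that were filtered out above.
result = revised.load_state_dict(compatible, strict=False)
skipped = sorted(result.missing_keys)  # layers that were not loaded

# The revised network still runs on 128x128 inputs.
out = revised(torch.randn(2, 3, 128, 128))
```

Here the convolution weights are copied and the final linear layer is skipped because its shape differs, so that layer would be the one to train from scratch during fine-tuning.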

rmokady · March 21, 2019, 1:56pm

I don’t want to resize, as I’m afraid my GPU will struggle with large images.

By loading part of the weights, do you mean loading the weights into only some of the layers?

ptrblck · March 21, 2019, 6:22pm

If resizing to 224x224 isn’t an option, note that the torchvision models have since been updated to use adaptive pooling layers (e.g. vgg16), so the input size is now more flexible.
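
To show why an adaptive pooling layer makes the input size flexible, here is a small sketch. `TinyVGGStyle` is a hypothetical toy model, not the torchvision implementation; it only mimics the pattern of placing `nn.AdaptiveAvgPool2d((7, 7))` between the convolutional features and a fixed-size linear classifier.

```python
import torch
from torch import nn


class TinyVGGStyle(nn.Module):
    """Toy model (assumption for illustration), not torchvision's vgg16."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        # Always produces a 7x7 map, whatever the spatial size coming in.
        self.avgpool = nn.AdaptiveAvgPool2d((7, 7))
        self.classifier = nn.Linear(16 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)


model = TinyVGGStyle().eval()
with torch.no_grad():
    # Both input sizes pass through the same fixed-size classifier.
    shapes = {
        size: tuple(model(torch.randn(1, 3, size, size)).shape)
        for size in (128, 224)
    }
```

Without the adaptive pooling layer, the flattened feature size would depend on the input resolution and the linear layer would reject anything other than the size it was built for.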

111429 · February 11, 2021, 7:46am

Thanks.
Maybe the sentence “where H and W are expected to be at least 224.”
in the torchvision.models page of the PyTorch 1.7.1 documentation
could be removed?

ptrblck · February 11, 2021, 7:47am

This should be the case for some models, but the minimal size would need to be determined for each model separately. Feel free to create a GitHub issue and let us know if you would be interested in fixing this.
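
As one way to estimate such a minimal size, the sketch below traces the spatial size through a VGG16-style feature stack using the standard output-size formula `floor((n + 2*padding - kernel) / stride) + 1`. It assumes the usual VGG16 layout (3x3 convolutions with padding 1, and five 2x2 max pools with stride 2); the function names are made up for this example, and other architectures would need their own layer list.

```python
def out_size(n, kernel, stride, padding):
    # Standard conv/pool output-size formula (dilation 1, floor mode).
    return (n + 2 * padding - kernel) // stride + 1


# Number of 3x3 convolutions in each of the five VGG16 blocks;
# every block ends with a 2x2 max pool of stride 2.
VGG16_BLOCKS = [2, 2, 3, 3, 3]


def vgg16_feature_size(n):
    """Spatial size after the feature stack, or 0 if the map vanishes."""
    for convs in VGG16_BLOCKS:
        for _ in range(convs):
            n = out_size(n, kernel=3, stride=1, padding=1)
        n = out_size(n, kernel=2, stride=2, padding=0)
        if n < 1:
            return 0
    return n


def minimal_input_size(upper=224):
    """Smallest square input that still yields at least a 1x1 feature map."""
    for n in range(1, upper + 1):
        if vgg16_feature_size(n) >= 1:
            return n
    return None
```

Under these assumptions a 128x128 input gives a 4x4 feature map and the smallest input that survives all five pooling steps is 32x32. This only says the forward pass is possible; whether the pretrained features remain useful at that resolution is a separate question.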