I want to use a pretrained model for feature extraction (pretrained on ImageNet). I'd prefer GoogLeNet, but I think ResNet, VGG, or something similar will do.
The problem is that I want to use 128x128 RGB images, and I notice that the models in torchvision.models were pretrained on larger images.
Are you familiar with such a pretrained model (128x128)?
Two methods are available:
One is to resize your input images to 224x224 and then use ResNet, VGG, etc. directly.
The other is to revise the network structure of ResNet, VGG, etc. so that the variants accept 128x128 RGB images, load part of the weights of the pretrained ResNet or VGG, and finally use your own dataset to train/finetune the revised network.
I don’t want to resize, as I’m afraid my GPU will struggle with large images.
By loading part of the weights, do you mean loading the weights into only some of the layers?
If resizing to 224x224 isn’t an option, note that the torchvision models were updated to use adaptive pooling layers (e.g. vgg16), so the input size is now more flexible.
Maybe the note “where H and W are expected to be at least 224.” in the torchvision.models — PyTorch 1.7.1 documentation could be removed?
This should be the case for some models, but the minimal size would need to be determined for each model separately. Feel free to create a GitHub issue and let us know if you would be interested in fixing this.