Transfer learning image size

I am following the tutorial for transfer learning

http://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html

I wish to train on a custom dataset which cannot be cropped, as cropping would result in relevant data being lost.

224x224 is too small for my use case.

Maybe I could resize my data to 480x640, but I would prefer not to alter the images.

When I try to train the model I get a size mismatch error.
It seems the implementation of the model only allows images that are 224x224.

Is this correct?

Looking at the model vs. the Torch version, there is a single fixed kernel size of 7 for AvgPool, i.e. a single expected input size for that layer. Since the network downsamples by an overall factor of 32, this suggests the input must be square with side 32 * 7 = 224.
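
As a sanity check, I can reproduce this with a tiny sketch (assuming the resnet18 from the tutorial, and a torchvision build where avgpool is a fixed nn.AvgPool2d(7)):

import torch
import torchvision.models as models

model = models.resnet18(pretrained=True)
model.eval()

# 224x224 works: downsampling by 32 leaves a 7x7 feature map,
# which exactly matches the fixed 7x7 average pool.
model(torch.randn(1, 3, 224, 224))  # ok

# 480x640 fails: the feature map is 15x20, the 7x7 pool leaves 2x2,
# and the flattened size no longer matches the fc layer.
model(torch.randn(1, 3, 480, 640))  # RuntimeError: size mismatch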

In Torch:

model:add(Convolution(3,64,7,7,2,2,3,3))

In PyTorch:

self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)

If I change the Torch model to model:add(Convolution(3,64,15,20,2,2,3,3)), it will at least allow me to train with 480x640 images… although it will not allow me to fine-tune a pretrained model.
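
The PyTorch analogue shows why fine-tuning then breaks; a hypothetical sketch of the same edit (the (15, 20) kernel is just my guess at an equivalent change, not something from the tutorial):

import torch.nn as nn
import torchvision.models as models

model = models.resnet18()
model.conv1 = nn.Conv2d(3, 64, kernel_size=(15, 20), stride=2,
                        padding=3, bias=False)

# The pretrained conv1 weights have shape (64, 3, 7, 7), but the new
# layer expects (64, 3, 15, 20), so loading them fails.
pretrained = models.resnet18(pretrained=True).state_dict()
model.load_state_dict(pretrained)  # RuntimeError: size mismatch for conv1.weight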

So I basically have three questions:

  1. Can the pretrained models be used to train on different image sizes?
  2. Do all training images have to be the same size? (I thought fully convolutional networks would allow any input size… this kind of training works with TensorFlow and Inception-v3.)
  3. How do I fine-tune a model with images which are not 224x224?

You need to modify the size of your last layer after convolutions finish.

Look at one of my earlier posts on this topic for a solution.
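
For example, for 480x640 inputs the last feature map is 15x20, the fixed 7x7 pool leaves 2x2, so the flattened vector has 512 * 2 * 2 = 2048 features instead of 512. A rough sketch of resizing the last layer to match (the class count is hypothetical; use your dataset's):

import torch.nn as nn
import torchvision.models as models

model = models.resnet18(pretrained=True)

# Resize the final linear layer to accept the larger flattened vector.
num_classes = 2  # hypothetical; e.g. the tutorial's ants/bees dataset
model.fc = nn.Linear(512 * 2 * 2, num_classes)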

Thanks for the pointer.

I used

model.avgpool = nn.AdaptiveAvgPool2d(1)

to get this to work.
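
In full, something like this (a minimal sketch, assuming the resnet18 from the tutorial):

import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(pretrained=True)

# Adaptive pooling always outputs a 1x1 map, so the pretrained fc layer
# keeps receiving the 512 features it expects, for any input size.
model.avgpool = nn.AdaptiveAvgPool2d(1)
model.eval()

out = model(torch.randn(1, 3, 480, 640))
print(out.shape)  # torch.Size([1, 1000])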
