InceptionV3 for different image sizes

Hello, PyTorch forum!
I am looking for an example of modifying and fine-tuning the pretrained InceptionV3 for different image sizes. Any hints?

Fine tuning makes sense. Look at the ImageNet example or the Transfer learning tutorial.

I suspect it’ll be easier to scale and/or crop your images than to try to adapt InceptionV3 to a different image size. What size images do you have?

For smaller images, you’ll have to zero-pad or scale and crop them.

For larger images, you can scale and crop them or run the network over them in a “fully convolutional” manner. Scaling and cropping will be more efficient.

To run the network in a fully convolutional manner, replace the nn.Linear layers with 1x1 nn.Conv2d convolutions. You’ll then get multiple predictions per image. This can be significantly more computationally expensive, but might give you slightly better predictions. (Object detection and segmentation networks tend to use this method.)
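As a dependency-free illustration of why this yields several predictions per image: a 1x1 convolution applies the same weights as the linear layer at every spatial position of the feature map. The toy weights, the 3-channel 2x2 feature map, and the helper names below are made up for the example. In PyTorch the equivalent step would be copying the nn.Linear weight, reshaped to (out_features, in_features, 1, 1), into an nn.Conv2d with kernel_size=1.

```python
def linear(weight, bias, features):
    """Fully connected layer applied to a single feature vector."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weight, bias)]


def conv1x1(weight, bias, feature_map):
    """The same weights applied as a 1x1 convolution over a C x H x W map."""
    channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    out = [[[0] * width for _ in range(height)] for _ in weight]
    for i in range(height):
        for j in range(width):
            # Gather the channel vector at this spatial position.
            pixel = [feature_map[c][i][j] for c in range(channels)]
            for k, score in enumerate(linear(weight, bias, pixel)):
                out[k][i][j] = score
    return out


# Hypothetical toy classifier: 3 input channels, 2 classes.
weight = [[1, 0, -1],
          [0, 2, 1]]
bias = [0, 1]

# Hypothetical 3 x 2 x 2 feature map (channels x height x width).
feature_map = [
    [[1, 2], [3, 4]],
    [[0, 1], [1, 0]],
    [[2, 2], [0, 1]],
]

# One score per class at each of the 2 x 2 positions, instead of a single
# score per class for the whole image.
scores = conv1x1(weight, bias, feature_map)
print(scores)  # [[[-1, 0], [3, 3]], [[3, 5], [3, 2]]]
```

Averaging the per-position scores is one simple way to turn them back into a single prediction per image.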

You could also try adding layers to the Inception definition, but I wouldn’t recommend it. I don’t think initializing from the pre-trained weights would help much in that case.

Here’s the inception model definition for reference.

Again, I think your best bet is to scale and crop your images. But if that’s not acceptable for some reason, you can try one of the methods above.
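The arithmetic behind zero-padding and scale-and-crop can be sketched without any dependencies. This is a minimal illustration that assumes a 299x299 target (the input size the pretrained Inception v3 is normally used with); the function names are made up for the example, and in practice torchvision's transforms.Pad, transforms.Resize, and transforms.CenterCrop would do the actual image work.

```python
def zero_pad_amounts(height, width, target=299):
    """Return (left, top, right, bottom) zero padding that centers the image."""
    if height > target or width > target:
        raise ValueError("image is larger than the target; scale and crop instead")
    pad_h, pad_w = target - height, target - width
    top, left = pad_h // 2, pad_w // 2
    # Any odd leftover pixel goes to the right/bottom edge.
    return left, top, pad_w - left, pad_h - top


def scale_and_center_crop(height, width, target=299):
    """Scale the shorter side to `target`, then center-crop a square.

    Returns the resized (height, width) and the crop box
    (left, top, right, bottom) in the resized image.
    """
    scale = target / min(height, width)
    new_h = max(target, round(height * scale))
    new_w = max(target, round(width * scale))
    top, left = (new_h - target) // 2, (new_w - target) // 2
    return (new_h, new_w), (left, top, left + target, top + target)


# A 200x250 image (height x width) is smaller than 299x299, so pad it.
print(zero_pad_amounts(200, 250))        # (24, 49, 25, 50)

# A 512x1024 image is larger, so scale the short side to 299 and crop.
print(scale_and_center_crop(512, 1024))  # ((299, 598), (149, 0, 448, 299))
```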

I had to use a specific image size, and resizing/scaling was not an option. So I used code like this:

import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import models


class CustomInceptionV3(models.Inception3):
    def __init__(self, model_orig, num_classes):
        super(CustomInceptionV3, self).__init__()
        # New classifier head sized for the target number of classes.
        num_feats = model_orig.fc.in_features
        self.fc = nn.Linear(num_feats, num_classes)
        self.aux_logits = model_orig.aux_logits

        # Reuse the pretrained feature-extraction layers.
        self.Conv2d_1a_3x3 = model_orig.Conv2d_1a_3x3
        self.Conv2d_2a_3x3 = model_orig.Conv2d_2a_3x3
        self.Conv2d_2b_3x3 = model_orig.Conv2d_2b_3x3

        self.Conv2d_3b_1x1 = model_orig.Conv2d_3b_1x1
        self.Conv2d_4a_3x3 = model_orig.Conv2d_4a_3x3

        self.Mixed_5b = model_orig.Mixed_5b
        self.Mixed_5c = model_orig.Mixed_5c
        self.Mixed_5d = model_orig.Mixed_5d
        self.Mixed_6a = model_orig.Mixed_6a
        self.Mixed_6b = model_orig.Mixed_6b
        self.Mixed_6c = model_orig.Mixed_6c
        self.Mixed_6d = model_orig.Mixed_6d
        self.Mixed_6e = model_orig.Mixed_6e
        self.Mixed_7a = model_orig.Mixed_7a
        self.Mixed_7b = model_orig.Mixed_7b
        self.Mixed_7c = model_orig.Mixed_7c

    def forward(self, x):
        if self.transform_input:
            x = x.clone()
            x[:, 0] = x[:, 0] * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
            x[:, 1] = x[:, 1] * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
            x[:, 2] = x[:, 2] * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
        x = self.Conv2d_1a_3x3(x)
        x = self.Conv2d_2a_3x3(x)
        x = self.Conv2d_2b_3x3(x)
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        x = self.Conv2d_3b_1x1(x)
        x = self.Conv2d_4a_3x3(x)
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        x = self.Mixed_5b(x)
        x = self.Mixed_5c(x)
        x = self.Mixed_5d(x)
        x = self.Mixed_6a(x)
        x = self.Mixed_6b(x)
        x = self.Mixed_6c(x)
        x = self.Mixed_6d(x)
        x = self.Mixed_6e(x)
        # The auxiliary classifier is only used during training.
        if self.training and self.aux_logits:
            aux = self.AuxLogits(x)
        x = self.Mixed_7a(x)
        x = self.Mixed_7b(x)
        x = self.Mixed_7c(x)
        # Adaptive pooling reduces any spatial size to 1x1.
        x = F.adaptive_avg_pool2d(x, 1)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        if self.training and self.aux_logits:
            return x, aux
        return x

class CustomInceptionAux(nn.Module):
    def __init__(self, in_channels, num_classes):
        super(CustomInceptionAux, self).__init__()
        self.conv0 = BasicConv2d(in_channels, 128, kernel_size=1)
        self.conv1 = BasicConv2d(128, 768, kernel_size=5)
        self.conv1.stddev = 0.01
        num_feats = 768
        self.fc = nn.Linear(num_feats, num_classes)

    def forward(self, x):
        # Adaptive pooling yields a 5x5 map for any input size,
        # so the 5x5 conv1 below always reduces it to 1x1.
        x = F.adaptive_avg_pool2d(x, 5)
        x = self.conv0(x)
        x = self.conv1(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

class BasicConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, **kwargs):
        super(BasicConv2d, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, bias=False, **kwargs)
        self.bn = nn.BatchNorm2d(out_channels, eps=0.001)

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        return F.relu(x, inplace=True)

model_orig = torchvision.models.inception_v3(pretrained=True)
model = CustomInceptionV3(model_orig, num_classes=4)
model.AuxLogits = CustomInceptionAux(768, 4)

After changing the code to accept variable image sizes, have you tested the performance? Is it the same as the original?

I tested the model on new 512*512 images, but did not do a comparison. However, as @colesbury suggested, the best option for images of a different size is to resize them and fine-tune the pretrained network for your application.

@bmarami is it possible to extend this up to 1024*1204 input? I am currently using DetectNet and have a very large dataset already adjusted to this size. I want to try a different network, but size is a problem. I cannot resize the dataset since the images contain both large and very small objects.

Because of F.adaptive_avg_pool2d(), this would work (I mean, run without an error) for any image size. However, the accuracy of the pre-trained network would be low, and you would have to retrain the network.
Again, I think you would get better results if you downsample images at the beginning, rather than using a larger kernel for pooling at the end.
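To make the role of the adaptive pooling concrete, the spatial size of the last feature map can be traced by hand. The sketch below is plain Python arithmetic; the kernel, stride, and padding values come from a reading of the size-changing layers in torchvision's Inception3 and should be treated as assumptions to check against the model definition. The remaining Mixed_* blocks are assumed to preserve the spatial size.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) for each layer assumed to change the
# spatial size; all other Mixed_* blocks are assumed to keep it unchanged.
LAYERS = [
    ("Conv2d_1a_3x3", 3, 2, 0),
    ("Conv2d_2a_3x3", 3, 1, 0),
    ("Conv2d_2b_3x3", 3, 1, 1),
    ("max_pool2d", 3, 2, 0),
    ("Conv2d_3b_1x1", 1, 1, 0),
    ("Conv2d_4a_3x3", 3, 1, 0),
    ("max_pool2d", 3, 2, 0),
    ("Mixed_6a", 3, 2, 0),
    ("Mixed_7a", 3, 2, 0),
]


def final_feature_size(input_size):
    """Side length of the feature map that reaches the final pooling."""
    size = input_size
    for _name, kernel, stride, padding in LAYERS:
        size = conv_out(size, kernel, stride, padding)
    return size


for side in (299, 512, 1024):
    print(side, "->", final_feature_size(side))
# 299 -> 8
# 512 -> 14
# 1024 -> 30
```

F.adaptive_avg_pool2d(x, 1) averages any of these maps down to 1x1, which is why the forward pass runs for each size. The final average is simply taken over more positions as the input grows.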

Thank you for your comments. The problem is that my small objects are around 35-60 pixels in size, and downscaling will probably destroy useful information. Training from scratch should not be a big deal since the dataset is large enough; I have started from zero with DetectNet many times, and it usually converges within the first 30-50 epochs. I really wonder about the accuracy with InceptionV3. Typically I was able to get over 65 mAP for a single class.
As for other object detection models, are there any in PyTorch that support larger inputs (512 and beyond)?

I tried to run this code, but it gives this error:
  File "/home/Drive2/shrey/shrey/code/Data/BACH/Data/Training/", line 22, in __init__
    self.fc = nn.Linear(num_feats, num_classes)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/", line 424, in __setattr__
    "cannot assign module before Module.__init__() call")
AttributeError: cannot assign module before Module.__init__() call
Can you please help me with this?