How to extract features of an image from a trained model

Because the input size is different, the output size of the last convolutional layer will be different as well.
This means that when you flatten the last feature map (with view) to feed the fully-connected layers, there will be a size mismatch.
One thing you can do is replace the last pooling layer with an AdaptiveMaxPool2d, which forces the final feature map to have the same size as before.
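
A minimal sketch of that idea, assuming a tiny stand-in convolutional stack (the `conv` block and the 7x7 target size are illustrative choices, not taken from any particular model). The adaptive pooling layer gives the same output size for both input resolutions:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the convolutional part of a network.
conv = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
)
# Adaptive pooling always emits a 7x7 map, whatever the input size.
pool = nn.AdaptiveMaxPool2d((7, 7))

out_224 = pool(conv(torch.randn(1, 3, 224, 224)))
out_256 = pool(conv(torch.randn(1, 3, 256, 256)))
print(out_224.shape, out_256.shape)
```

Because both outputs have the same shape, a fully-connected layer sized for one would accept the other.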

I do not need the fully-connected layers. I just need the conv1_1 ~ relu5_3 layers of vgg16, and I want the output of relu5_3 given a 256x256 input. How can I make the pretrained weights of the conv1_1 ~ relu5_3 layers accept a 256x256 input?

You don't need to do anything to support different sizes for the convolutional layers.
I just tried your example with an input of size 1x3x256x256 and it worked just fine.

Do the input image pixel values need to be in [0, 1] or in [0, 255]?

The information about how to pre-process the models can be found in
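
As a rough sketch of the usual torchvision convention: images are scaled to [0, 1] and then normalized per channel with the ImageNet mean and standard deviation. The `preprocess` helper below is a hypothetical name used only for illustration.

```python
import torch

# ImageNet statistics used by torchvision's pretrained models.
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def preprocess(batch_uint8):
    """Map an N x 3 x H x W uint8 batch in [0, 255] to normalized floats."""
    scaled = batch_uint8.float() / 255.0  # now in [0, 1]
    return (scaled - mean) / std

batch = torch.randint(0, 256, (1, 3, 256, 256), dtype=torch.uint8)
x = preprocess(batch)
print(x.shape, x.dtype)
```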

Regarding this approach

new_classifier = nn.Sequential(*list(model.classifier.children())[:-1])
model.classifier = new_classifier

This method worked for vgg16, but unfortunately it does not work for inception_v3, as that model does not have a classifier attribute. What would be the best approach for extracting features from the inception_v3 model?


Because of the way Inception_v3 is structured, you would need to do some more manual work to get the layers that you want.
Something along the lines of

import torch.nn as nn
import torch.nn.functional as F
import torchvision

class MyInceptionFeatureExtractor(nn.Module):
    def __init__(self, inception, transform_input=False):
        super(MyInceptionFeatureExtractor, self).__init__()
        self.transform_input = transform_input
        # reuse the (pretrained) submodules under their original names
        self.Conv2d_1a_3x3 = inception.Conv2d_1a_3x3
        self.Conv2d_2a_3x3 = inception.Conv2d_2a_3x3
        self.Conv2d_2b_3x3 = inception.Conv2d_2b_3x3
        self.Conv2d_3b_1x1 = inception.Conv2d_3b_1x1
        self.Conv2d_4a_3x3 = inception.Conv2d_4a_3x3
        self.Mixed_5b = inception.Mixed_5b
        # stop where you want, copy paste from the model def

    def forward(self, x):
        if self.transform_input:
            # index the channel dimension (dim 1), not the batch dimension
            x = x.clone()
            x[:, 0] = x[:, 0] * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
            x[:, 1] = x[:, 1] * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
            x[:, 2] = x[:, 2] * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
        # 299 x 299 x 3
        x = self.Conv2d_1a_3x3(x)
        # 149 x 149 x 32
        x = self.Conv2d_2a_3x3(x)
        # 147 x 147 x 32
        x = self.Conv2d_2b_3x3(x)
        # 147 x 147 x 64
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        # 73 x 73 x 64
        x = self.Conv2d_3b_1x1(x)
        # 73 x 73 x 80
        x = self.Conv2d_4a_3x3(x)
        # 71 x 71 x 192
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        # 35 x 35 x 192
        x = self.Mixed_5b(x)
        # copy paste from model definition, just stopping where you want
        return x

# newer torchvision versions use the weights= argument instead of pretrained=
inception = torchvision.models.inception_v3(pretrained=True)
my_inception = MyInceptionFeatureExtractor(inception)

Thanks for the quick response,
I get the gist of how to do it.

inception = torchvision.models.inception_v3(pretrained=True)
inception does not have the attribute 'Conv2d_1a_3x3'.

How do I access them using inception?

It should have these modules, as shown in the link in my previous message.
If not, you are probably using a DataParallel on top of inception, so you first need to get the module back by using inception.module.Conv2d_1a_3x3. Just to be sure, print your model and inspect the names of the modules that are printed.
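
A small sketch of that check, using a tiny stand-in network instead of inception. The `unwrap` helper is a hypothetical name, not part of PyTorch:

```python
import torch.nn as nn

def unwrap(model):
    """Return the underlying module if `model` is wrapped in DataParallel."""
    return model.module if isinstance(model, nn.DataParallel) else model

# Small hypothetical network standing in for inception.
net = nn.Sequential()
net.add_module("Conv2d_1a_3x3", nn.Conv2d(3, 32, kernel_size=3, stride=2))
wrapped = nn.DataParallel(net)

# The wrapper exposes only `module`; the original names live one level down.
print([name for name, _ in wrapped.named_children()])
print([name for name, _ in unwrap(wrapped).named_children()])
```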

Got it to work. Thanks :)

So how do I perform the backward pass of a model with multiple outputs? Is it like this?

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 1, 3)
        self.conv2 = nn.Conv2d(1, 1, 3)
        self.conv3 = nn.Conv2d(1, 1, 3)

    def forward(self, x):
        out1 = F.relu(self.conv1(x))
        out2 = F.relu(self.conv2(out1))
        out3 = F.relu(self.conv3(out2))
        return out1, out2, out3

model = Net()
o1, o2, o3 = model(input)
loss1 = criterion(o1, target1)
loss2 = criterion(o2, target2)
loss3 = criterion(o3, target3)

If it is not like this, how to make this work?

You can use autograd.backward, or just sum the losses and backprop the summed loss:

loss = loss1 + loss2 + loss3
loss.backward()
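
A quick sketch comparing the two options on a toy layer (the layer, shapes and targets are made up for illustration). Both should accumulate the same gradients in the parameters:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(4, 3)
x = torch.randn(2, 4)
t1, t2 = torch.randn(2, 3), torch.randn(2, 3)
criterion = nn.MSELoss()

# Option 1: sum the losses and call backward once.
out = layer(x)
(criterion(out, t1) + criterion(out, t2)).backward()
grad_sum = layer.weight.grad.clone()

# Option 2: pass both scalar losses to torch.autograd.backward.
layer.zero_grad()
out = layer(x)
torch.autograd.backward([criterion(out, t1), criterion(out, t2)])
grad_multi = layer.weight.grad.clone()

print(torch.allclose(grad_sum, grad_multi, atol=1e-6))
```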

I wrote a demo:

import torch
import torch.nn as nn

criterion = nn.MSELoss()
x = torch.randn(1, 100, requires_grad=True)
y = torch.randn(1, 40)

class ToyModel(nn.Module):
    def __init__(self):
        super(ToyModel, self).__init__()
        self.linear1 = nn.Linear(100, 50)
        self.linear2 = nn.Linear(50, 40)
        self.linear3 = nn.Linear(100, 40)

    def forward(self, x):
        out1 = self.linear1(x)
        out2 = self.linear2(out1)
        out3 = self.linear3(x)
        return out2, out3

model = ToyModel()
out2, out3 = model(x)
print(out3)
loss1 = criterion(out2, y)
loss2 = criterion(out3, y)
# print(out2.grad)

# grad1 and grad2 are not defined yet (see the question below)
torch.autograd.backward(x, [grad1, grad2])

But how do I get grad1 and grad2?
Printing out2.grad just gives me 'None'.
Is there a bug in my code? Thanks!

But how to get grad1 and grad2 ?

You don't have to, that's why PyTorch is amazing:

out2, out3 = model(x)
loss1 = criterion(out2, y)
loss2 = criterion(out3, y)
loss = loss1 + loss2
loss.backward()

If I modify it to:
loss = loss1 + 0.8 * loss2
then it is a weighted loss?

Yes, it is weighted. You can mix the losses as you want.

You need to use hooks if you want to inspect gradients of intermediate variables.
Refer to the discussion here: Why cant I see .grad of an intermediate variable?

In your case you need to attach a hook to out2 and out3, which receives the 'grad' during the backward pass.
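
A minimal sketch of such hooks, assuming two stand-in linear layers in place of the toy model (the `grads` dict and the layer names are illustrative). `register_hook` is called on the intermediate tensors, and each hook stores the gradient it is handed:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lin_a, lin_b = nn.Linear(100, 40), nn.Linear(100, 40)  # hypothetical stand-ins
criterion = nn.MSELoss()
x = torch.randn(1, 100)
y = torch.randn(1, 40)

out2, out3 = lin_a(x), lin_b(x)

grads = {}
# Each hook is called with the gradient of the loss w.r.t. that tensor.
# dict.update returns None, so the gradient itself is left unchanged.
out2.register_hook(lambda g: grads.update(out2=g))
out3.register_hook(lambda g: grads.update(out3=g))

loss = criterion(out2, y) + criterion(out3, y)
loss.backward()
print(grads["out2"].shape, grads["out3"].shape)
```

Calling `out2.retain_grad()` before `backward()` is another option if you would rather read `out2.grad` directly afterwards.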