Extract deep features from inception_v3

When I tried to extract deep features using a trained inception_v3 model:

model = torchvision.models.inception_v3(pretrained=True)
model.fc = nn.Linear(2048, 1)
model.load_state_dict(torch.load('./models/Beauty_inception_reg.pt'))
feature_extractor = torch.nn.Sequential(*list(model.children())[:-1])

I got the following error:
File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torch\nn\modules\conv.py", line 345, in forward
return self.conv2d_forward(input, self.weight)

File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torch\nn\modules\conv.py", line 342, in conv2d_forward
self.padding, self.dilation, self.groups)

RuntimeError: Expected 4-dimensional input for 4-dimensional weight 192 768 1 1, but got 2-dimensional input of size [1, 1000] instead

Any ideas what is causing this error?

Wrapping child modules into an nn.Sequential container will only work in simple use cases, where each module is called sequentially and no functional calls are used in the forward method.

As you can see here, the Inception model uses functional pooling layers, conditionals, dropout, and a flatten call inside its forward method. All of these functional API calls will be missing from your feature_extractor.

You could thus derive a custom model and manipulate the forward method as you wish, or alternatively replace the unwanted layers with nn.Identity to get the desired output.
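
For illustration, a minimal sketch of the nn.Identity approach wrapped in a small custom module (the class name InceptionFeatures is only illustrative, not from the thread):

import torch
import torch.nn as nn
import torchvision

class InceptionFeatures(nn.Module):
    def __init__(self):
        super(InceptionFeatures, self).__init__()
        self.model = torchvision.models.inception_v3(pretrained=True)
        self.model.fc = nn.Identity()  # drop the classifier, keep the 2048-dim pooled features

    def forward(self, x):
        out = self.model(x)
        # in training mode with aux_logits=True the model returns a namedtuple
        if isinstance(out, tuple):
            out = out[0]
        return out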


Thanks a lot.
I solved it by adding

model.fc = nn.Identity()


I have another question.
I am trying to train ResNet50 and Inception_v3 simultaneously, so I made the following class:

class TwoInputsNet(nn.Module):
  def __init__(self):
    super(TwoInputsNet, self).__init__()
    self.model1 = torchvision.models.resnet50(pretrained=True)
    self.model1.fc = nn.Linear(2048, 1024)        
    self.model2 = torchvision.models.inception_v3(pretrained=True)
    self.model2.fc = nn.Linear(2048, 1024)
    self.fc2 = nn.Linear(2048, 1)


  def forward(self, input1, input2):
    c = self.model1(input1)
    f = self.model2(input2)
    combined = torch.cat((c,torch.Tensor(f)), dim=1)

    out = self.fc2(combined)
    return out

and I am getting the following error:

  File "D:/Face Beauty Pytorch/transfer_learning_inception_resnet50_regression.py", line 152, in <module>
    preds = model(images1, images2)

  File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)

  File "D:/Face Beauty Pytorch/transfer_learning_inception_resnet50_regression.py", line 82, in forward
    f = self.model2(input2)

  File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)

  File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torchvision\models\inception.py", line 132, in forward
    aux = self.AuxLogits(x)

  File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)

  File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torchvision\models\inception.py", line 332, in forward
    x = self.conv1(x)

  File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)

  File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torchvision\models\inception.py", line 353, in forward
    x = self.bn(x)

  File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)

  File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torch\nn\modules\batchnorm.py", line 81, in forward
    exponential_average_factor, self.eps)

  File "c:\users\fares\appdata\local\programs\python\python37\lib\site-packages\torch\nn\functional.py", line 1666, in batch_norm
    raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 768, 1, 1])

Could you suggest a solution to this issue?

Inception_v3 needs more than a single sample during training, as at some point inside the model the activation will have the shape [batch_size, 768, 1, 1] and thus the batchnorm layer won't be able to calculate the batch statistics.
You could set the model to eval(), which will use the running statistics instead, or increase the batch size.
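
For example, a minimal sketch of both workarounds (dataset is a placeholder for your own Dataset instance, TwoInputsNet is the class above):

import torch
from torch.utils.data import DataLoader

# option 1: make sure more than one sample reaches inception_v3's batchnorm layers
loader = DataLoader(dataset, batch_size=2, drop_last=True)  # dataset: placeholder

# option 2: keep the inception branch in eval mode so its batchnorm layers use the
# running statistics; note that a later call to model.train() switches it back
model = TwoInputsNet()
model.model2.eval()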

Thanks a lot.
Now I am getting a new error:

File "D:/Face Beauty Pytorch/transfer_learning_inception_resnet50_regression.py", line 90, in forward
combined = torch.cat(c,torch.Tensor(f), dim=1)

ValueError: only one element tensors can be converted to Python scalars

Could you post the shapes of c and f, please?

Sure, I tried to summarize everything.
When I tried to print the shapes of c and f, I got:

torch.Size([2, 1024])

print(f.shape)
AttributeError: 'InceptionOutputs' object has no attribute 'shape'

so I tried

   def forward(self, input1, input2):
     c = self.model1(input1)
     f = self.model2(input2) 
     print(c.shape)
     l, ll= f
     print(l.shape)
     combined = torch.cat(c,torch.Tensor(f), dim=1) 
     out = self.fc2(combined)
     return out

and got

torch.Size([2, 1024])
torch.Size([2, 1024])

combined = torch.cat(c,torch.Tensor(f), dim=1)

ValueError: only one element tensors can be converted to Python scalars

then I tried

  def forward(self, input1, input2):
    c = self.model1(input1)
    f = self.model2(input2) 
    print(c.shape)
    l, ll= f
    print(l.shape)
    combined = torch.cat(c,torch.Tensor(l), dim=1)
    out = self.fc2(combined)
    return out

The code did not run at all, and it did not return any error:

runfile('D:/Face Beauty Pytorch/transfer_learning_inception_resnet50_regression.py', wdir='D:/Face Beauty Pytorch')
  0%|          | 0/1100 [00:00<?, ?it/s]

Thanks for the update.
InceptionOutputs is a namedtuple containing the logits and aux_logits, so your approach of passing the first return value (logits) to torch.cat should be correct.
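
For reference, a hedged sketch of a corrected forward for the TwoInputsNet class above (not necessarily the exact code the original poster ended up with):

  def forward(self, input1, input2):
      c = self.model1(input1)              # [batch_size, 1024] from the resnet50 branch
      f = self.model2(input2)              # InceptionOutputs(logits, aux_logits) while training
      if isinstance(f, tuple):             # keep only the main logits
          f = f[0]
      combined = torch.cat((c, f), dim=1)  # torch.cat expects a tuple/list of tensors
      return self.fc2(combined)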


Thanks so much,
It is working now

If anyone is looking for a way to extract the features of inception_v3 layer by layer:

from torchvision.models.inception import Inception3
from torchvision.models.utils import load_state_dict_from_url
import torch.nn.functional as F


model_urls = {
    # Inception v3 ported from TensorFlow
    'inception_v3_google': 'https://download.pytorch.org/models/inception_v3_google-1a9a5a14.pth',
}
def inception_v3_sliced(pretrained=False, progress=True, stop_layer=3, **kwargs):
    r"""Inception v3 model architecture from
    `"Rethinking the Inception Architecture for Computer Vision" <http://arxiv.org/abs/1512.00567>`_.
    .. note::
        **Important**: In contrast to the other models the inception_v3 expects tensors with a size of
        N x 3 x 299 x 299, so ensure your images are sized accordingly.
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
        stop_layer (int): number of entries of the layer list to run before returning the activation
        aux_logits (bool): If True, add an auxiliary branch that can improve training.
            Default: *True*
        transform_input (bool): If True, preprocesses the input according to the method with which it
            was trained on ImageNet. Default: *False*
    """
    class Inception3Mod(Inception3):
        def __init__(self, stop_layer, **kwargs):
            super(Inception3Mod, self).__init__(**kwargs)
            self.stop_layer = stop_layer

        def _forward(self, x):
            layers = [
                self.Conv2d_1a_3x3,
                self.Conv2d_2a_3x3,
                self.Conv2d_2b_3x3,
                'maxpool',
                self.Conv2d_3b_1x1,
                self.Conv2d_4a_3x3,
                'maxpool',
                self.Mixed_5b,
                self.Mixed_5c,
                self.Mixed_5d,
                self.Mixed_6a,
                self.Mixed_6b,
                self.Mixed_6c,
                self.Mixed_6d,
                self.Mixed_6e,
                self.Mixed_7a,
                self.Mixed_7b,
                self.Mixed_7c,
            ]

            # only run the first `stop_layer` entries and return the intermediate activation
            for idx in range(self.stop_layer):
                layer = layers[idx]
                if layer == 'maxpool':
                    x = F.max_pool2d(x, kernel_size=3, stride=2)
                else:
                    x = layer(x)
            return x, None

    if pretrained:
        if 'transform_input' not in kwargs:
            kwargs['transform_input'] = True
        if 'aux_logits' in kwargs:
            original_aux_logits = kwargs['aux_logits']
            kwargs['aux_logits'] = True
        else:
            original_aux_logits = True
        kwargs['init_weights'] = False  # we are loading weights from a pretrained model
        model = Inception3Mod(stop_layer=stop_layer, **kwargs)
        state_dict = load_state_dict_from_url(model_urls['inception_v3_google'],
                                              progress=progress)
        model.load_state_dict(state_dict)
        if not original_aux_logits:
            model.aux_logits = False
            del model.AuxLogits
        return model

    return Inception3Mod(stop_layer=stop_layer, **kwargs)

Then call:

model = inception_v3_sliced(pretrained=True, stop_layer=12)
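
A hedged usage sketch; the commented shape is what I would expect for stop_layer=12 (output of Mixed_6b):

import torch

model = inception_v3_sliced(pretrained=True, stop_layer=12)
model.eval()

with torch.no_grad():
    x = torch.randn(1, 3, 299, 299)   # inception_v3 expects 299x299 inputs
    features = model(x)               # intermediate activation, e.g. torch.Size([1, 768, 17, 17])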


Can you show the full code where you managed to get inceptionv3 without the classification layer working?

Hi Jorisvane,

model = torchvision.models.inception_v3(pretrained=True)
model.fc = nn.Identity()
model.eval()

Then you can extract the deep features.
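
Continuing the snippet above, a hedged usage sketch with random input, just to show the expected output shape:

with torch.no_grad():
    x = torch.randn(4, 3, 299, 299)   # inception_v3 expects 299x299 inputs
    features = model(x)               # pooled deep features, shape [4, 2048]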