How to add a pretrained model into a new model class

I am trying to add a pretrained resnet50 model and then tack on two extra Linear (FC) layers. I defined a class with a few inputs like the one below, but it gives me an error.

RuntimeError: size mismatch, m1: [32 x 1000], m2: [2048 x 2048] at /opt/conda/conda-bld/pytorch_1579027003190/work/aten/src/THC/generic/THCTensorMathBlas.cu:290

I have an idea of what this error is saying: the resnet is producing the 1000-class classification output rather than the 2048-dimensional features that feed its final FC layer. I just don't know how to get those features out of the resnet model. Here is what I am doing.

# Create the pretrained ResNet model and check the input size of its final FC layer
from torchvision import models

resnet = models.resnet50(pretrained=True)
num_ftrs = resnet.fc.in_features  # 2048 for resnet50
print(num_ftrs)

And then I use it in the class like this:

class FinetuneResnet(nn.Module):
    def __init__(self, resnet_pretrain, resnet_filter, num_classes):
        super(FinetuneResnet, self).__init__()
        self.resnet = resnet_pretrain
        self.fc1 = nn.Linear(resnet_filter, 2048)
        self.fc2 = nn.Linear(2048, num_classes)
        self.dropout = nn.Dropout(0.3)
    
    def forward(self, x):
        x = self.resnet(x)
        x = torch.relu(self.fc1(x))
        x = self.dropout(x)
        x = torch.softmax(self.fc2(x))
        return x

The class is then used like this:

model = FinetuneResnet(resnet, num_ftrs, num_classes)

Thank you for any help with this.

A simple approach to “removing” the last classification layer would be to assign an nn.Identity module to it:

model = models.resnet50()
model.fc = nn.Identity()  # the final layer now just passes its input through

x = torch.randn(1, 3, 224, 224)
out = model(x)
print(out.shape)
> torch.Size([1, 2048])

This would basically skip the final layer and return the penultimate activation.
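
For the use case in the question, the truncated model can then be followed by a new head; a minimal sketch, where the head mirrors the fc1/fc2/dropout layers from the question and num_classes = 10 is just a placeholder:

import torch
import torch.nn as nn
from torchvision import models

resnet = models.resnet50(pretrained=True)
num_ftrs = resnet.fc.in_features      # 2048 for resnet50
resnet.fc = nn.Identity()             # the model now returns the pooled 2048-dim features

num_classes = 10                      # placeholder value
head = nn.Sequential(
    nn.Linear(num_ftrs, 2048),
    nn.ReLU(),
    nn.Dropout(0.3),
    nn.Linear(2048, num_classes),
)

x = torch.randn(2, 3, 224, 224)
logits = head(resnet(x))
print(logits.shape)                   # torch.Size([2, 10])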

Define your model as follows.

import torch.nn as nn
from torchvision import models

class FinetuneResnet(nn.Module):
    def __init__(self, num_classes):
        super(FinetuneResnet, self).__init__()

        self.model = models.resnet50(pretrained=True)
        self.fc1 = nn.Linear(2048, 2048)
        self.fc2 = nn.Linear(2048, num_classes)
        self.dropout = nn.Dropout(0.3)

    def forward(self, x):
        x = self.model.conv1(x)
        x = self.model.bn1(x)
        x = self.model.relu(x)
        x = self.model.maxpool(x)

        x = self.model.layer1(x)
        x = self.model.layer2(x)
        x = self.model.layer3(x)
        x = self.model.layer4(x)
        x = self.model.avgpool(x)

        x = x.view(x.size(0), -1)
        x = nn.functional.relu(self.fc1(x))
        x = self.dropout(x)
        x = nn.functional.softmax(self.fc2(x), dim=1)

        return x
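
A quick sanity check of this class (the batch size and num_classes below are arbitrary examples):

import torch

model = FinetuneResnet(num_classes=10)
x = torch.randn(2, 3, 224, 224)
out = model(x)
print(out.shape)    # torch.Size([2, 10])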

Thank you, that is a good way I think. I will try it using the functional approach and see where that gets me. I appreciate it.

This is what I would want, but I don't really get the forward method for it. Could you explain it in a little more detail? From what I gathered, you are pushing x through the resnet50 model, but what are layer1-4?
Thank you

Don’t use softmax if you are using a multi-class criterion like nn.CrossEntropyLoss (which expects logits) or nn.NLLLoss (which expects log probabilities, i.e. log_softmax outputs).
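
For example, nn.CrossEntropyLoss takes the raw logits from the last linear layer together with class indices; a minimal sketch with random data:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

logits = torch.randn(8, 5, requires_grad=True)   # raw model output, no softmax applied
targets = torch.randint(0, 5, (8,))              # ground-truth class indices
loss = criterion(logits, targets)
loss.backward()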

ResNet-50 has four stages of bottleneck blocks, which are represented by layer1 to layer4. Please check the ResNet paper at https://arxiv.org/pdf/1512.03385.pdf, and also refer to the class definition of ResNet in PyTorch at https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py.
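
To see these stages, one can print the top-level submodules of torchvision's resnet50 (a quick inspection sketch):

from torchvision import models

resnet = models.resnet50(pretrained=True)
# Prints: conv1, bn1, relu, maxpool, layer1, layer2, layer3, layer4, avgpool, fc
for name, module in resnet.named_children():
    print(name, type(module).__name__)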

As pointed out by @ptrblck, the softmax output should not be used while training the network; it is only applied at inference/validation time.
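
If probabilities are needed for validation or inference, softmax can be applied to the logits outside the training loop; a sketch, assuming model and x are defined as above:

import torch

with torch.no_grad():
    logits = model(x)                      # raw scores from the network
    probs = torch.softmax(logits, dim=1)   # probabilities, only for inspection/metrics
    preds = probs.argmax(dim=1)            # predicted class indices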

Thank you @ptrblck and @vgsprasad.
I got it to work, and I removed the softmax (thank you @ptrblck for the explanation).
This is how my forward method looks now:

    def forward(self, x):
        x = self.model.conv1(x)
        x = self.model.bn1(x)
        x = self.model.relu(x)
        x = self.model.maxpool(x)

        x = self.model.layer1(x)
        x = self.model.layer2(x)
        x = self.model.layer3(x)
        x = self.model.layer4(x)
        x = self.model.avgpool(x)

        x = x.view(x.size(0), -1)
        x = nn.functional.relu(self.fc1(x))
        x = self.dropout(x)
        #x = nn.functional.softmax(self.fc2(x), dim=1)
        x = self.fc2(x)
        return x

And it even works better than the model I had built in Keras.
I really appreciate your advice.
