Add simple neural net on top of Resnet

I am using ResNet34 for a multi-label classification problem and I want to add 3 hidden layers on top of the ResNet.

To do this I defined a new function _hidden() to add the new hidden layers, but this approach is not working for me.

class Resnet4Channel(nn.Module):
    def __init__(self, encoder_depth=34, pretrained=True, num_classes=28):
        super().__init__()

        encoder = RESNET_ENCODERS[encoder_depth](pretrained=pretrained)
        num_final_in = encoder.fc.in_features
        encoder.fc = nn.Linear(num_final_in, num_classes)
        
        self.num_classes = num_classes
        # we initialize this conv to take in 4 channels instead of 3
        # we keep the corresponding pretrained weights and initialize the new channel's weights with zeros
        # this trick taken from https://www.kaggle.com/iafoss/pretrained-resnet34-with-rgby-0-460-public-lb
        w = encoder.conv1.weight
        self.conv1 = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.conv1.weight = nn.Parameter(torch.cat((w,torch.zeros(64,1,7,7)),dim=1))
        
        self.bn1 = encoder.bn1
        self.relu = nn.ReLU(inplace=True) 
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        self.layer1 = encoder.layer1
        self.layer2 = encoder.layer2
        self.layer3 = encoder.layer3
        self.layer4 = encoder.layer4
        
        self.avgpool = encoder.avgpool
        self.fc = encoder.fc     
  
    def _hidden(self):
        n_in, n_h1, n_h2, n_h3, n_out = self.num_classes, 32, 64, 64, self.num_classes
        model = nn.Sequential(nn.Linear(n_in, n_h1),
                              nn.ReLU(),
                              nn.Linear(n_h1, n_h2),
                              nn.ReLU(),
                              nn.Linear(n_h2, n_h3),
                              nn.ReLU(),
                              nn.Linear(n_h3, n_out),
                              nn.Softmax())
        return model 
    
        
        
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        x = self._hidden(x)
        return x

Do you get an error, or what exactly is not working in your code?
If your model is just not training, it’s most likely due to the last softmax layer in your model.
It looks like you are dealing with a classification use case, i.e. you are most likely using nn.CrossEntropyLoss or nn.NLLLoss.
In the former case you should just pass the logits to the criterion (i.e. no non-linearity at the end of your model), while nn.NLLLoss needs nn.LogSoftmax() as the last layer.
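
To make the difference concrete, here is a minimal, self-contained sketch (the batch size and class count are made up) showing that raw logits passed to nn.CrossEntropyLoss give the same loss as nn.LogSoftmax followed by nn.NLLLoss:

import torch
import torch.nn as nn

logits = torch.randn(8, 28)             # raw model outputs, batch of 8, 28 classes (no softmax applied)
targets = torch.randint(0, 28, (8,))    # ground-truth class indices

# Option 1: pass the logits straight to nn.CrossEntropyLoss
loss_ce = nn.CrossEntropyLoss()(logits, targets)

# Option 2: make nn.LogSoftmax the last layer and use nn.NLLLoss
log_probs = nn.LogSoftmax(dim=1)(logits)
loss_nll = nn.NLLLoss()(log_probs, targets)

print(torch.allclose(loss_ce, loss_nll))  # True -- the two setups are equivalent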

@ptrblck Thanks, the comment on the softmax re: logits was very helpful. Apologies for the weak description of my problem. My model is training better now.

My biggest problem at the moment is that the layers in _hidden() are not added to the model when I iterate over the model's children(), even though _hidden is called in the forward function.

The _hidden Sequential module will be created on the fly in your forward. Since you don’t assign it as a class attribute, this submodule won’t be stored.
What is your current use case? Would it work, if you initialize this module in your __init__ like the other ones?
If that’s not possible for whatever reason, you could still create the submodule in your first forward pass, assign it as a member, and check if it’s already created in the subsequent forward calls.
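
For the last option, a minimal sketch of the idea (the LazyHead module and layer sizes are placeholders, not from the original code):

class LazyHead(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = None                        # submodule not created yet

    def forward(self, x):
        if self.hidden is None:
            # create the submodule on the first forward pass and assign it as a member,
            # so it gets registered and shows up in children() / parameters()
            self.hidden = nn.Sequential(
                nn.Linear(x.size(1), 32),
                nn.ReLU(),
                nn.Linear(32, x.size(1)),
            ).to(x.device)
        return self.hidden(x)

Note that parameters created lazily like this won't be known to an optimizer constructed before the first forward pass.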

I am building a multi-label classification model for images with 4 channels. I believe that the classes are correlated (i.e. they occur together in patterns), so if class 1 is present there is a greater chance of class 2 also being present. My hypothesis is that adding a few simple fully connected layers at the end of the model, with the predicted classes as input, will tease out these relationships and improve the accuracy of the model.

I’ve tried initializing the module in __init__, but the following error occurs once I start training. I get this error any time I try to add a layer in __init__:

RuntimeError: Given input size: (512x2x2). Calculated output size: (512x-4x-4). Output size is too small at /opt/conda/conda-bld/pytorch-nightly_1542185950098/work/aten/src/THNN/generic/SpatialAveragePooling.c:64

Code below:

import torch
import torch.nn as nn
import torchvision
from torchvision import models


RESNET_ENCODERS = {
    34: torchvision.models.resnet34,
    50: torchvision.models.resnet50,
    101: torchvision.models.resnet101,
    152: torchvision.models.resnet152,
}


class Resnet4Channel(nn.Module):
    def __init__(self, encoder_depth=34, pretrained=True, num_classes=28):
        super().__init__()

        encoder = RESNET_ENCODERS[encoder_depth](pretrained=pretrained)
        num_final_in = encoder.fc.in_features
        encoder.fc = nn.Linear(num_final_in, num_classes)
        
        self.num_classes = num_classes
        # we initialize this conv to take in 4 channels instead of 3
        # we keep the corresponding pretrained weights and initialize the new channel's weights with zeros
        # this trick taken from https://www.kaggle.com/iafoss/pretrained-resnet34-with-rgby-0-460-public-lb
        w = encoder.conv1.weight
        self.conv1 = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.conv1.weight = nn.Parameter(torch.cat((w,torch.zeros(64,1,7,7)),dim=1))
        
        self.bn1 = encoder.bn1
        self.relu = nn.ReLU(inplace=True) 
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        self.layer1 = encoder.layer1
        self.layer2 = encoder.layer2
        self.layer3 = encoder.layer3
        self.layer4 = encoder.layer4
        
        self.avgpool = encoder.avgpool
        self.fc = encoder.fc     
        self.hidden = self._hidden()
        
  
    def _hidden(self):
        layers = []

        n_in, n_h1, n_h2, n_h3, n_out = self.num_classes, 32, 64, 64, self.num_classes

        layers.append(nn.Linear(n_in, n_h1))
        layers.append(nn.ReLU(inplace=True))
        layers.append(nn.Linear(n_h1, n_h2))
        layers.append(nn.ReLU(inplace=True))
        layers.append(nn.Linear(n_h2, n_h3))
        layers.append(nn.ReLU(inplace=True))
        layers.append(nn.Linear(n_h3, n_out))

        return nn.Sequential(*layers)
                              
      
        
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        x = self.hidden(x)
        return x

The input to your average pooling layer is too small.
You could try to remove some pooling layers or change the hyperparameters of some conv/pooling layers, so that the spatial size of your activations increases.
Alternatively, you could also change the hyperparameters of your average pooling layer.
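
For the last suggestion, here is a minimal sketch. The fixed 7x7 pooling kernel is inferred from the error message (2 - 7 + 1 = -4), so treat it as an assumption; replacing the encoder's fixed-size average pooling with adaptive pooling makes the head work for small layer4 outputs as well:

import torch
import torch.nn as nn

feat = torch.randn(1, 512, 2, 2)          # the 512x2x2 layer4 output from the error message

# A fixed-size pooling layer with a 7x7 kernel cannot be applied to a 2x2 map and
# produces the "Output size is too small" error:
# nn.AvgPool2d(kernel_size=7, stride=1)(feat)   # would raise the RuntimeError above

# Adaptive pooling always returns a 1x1 map regardless of the input's spatial size,
# so self.fc and the hidden head keep working:
pool = nn.AdaptiveAvgPool2d((1, 1))
print(pool(feat).shape)                   # torch.Size([1, 512, 1, 1])

In the model above that would mean assigning self.avgpool = nn.AdaptiveAvgPool2d((1, 1)) instead of reusing encoder.avgpool (or, alternatively, feeding larger input images so the layer4 output is big enough for the original pooling layer).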