ResNet basic modifications (multi-target + regression)

mflova · June 21, 2020, 1:16pm

Hi!
I have been studying Machine Learning for a long time and have decided to start with Deep Learning models. I am trying to implement a regression problem (2 targets) on a BW processed image dataset that I have created.
Since I am using the ResNet architecture, I have tried to make some changes to the model, but I still have doubts about some of the modifications:

Regarding the input, I have run several tests with different numbers of channels and it is working properly.
However, regarding the outputs, I have two main concerns:

First, how can I modify the ResNet architecture in order to train a multi-target ResNet?
Second, regarding the regression problem: I have deleted the softmax layer at the end. However, I cannot test whether it works properly, since I couldn’t implement the multi-target ResNet. Will it work once the previous modification (multi-target) is done?

The ResNet model used is exactly the same as the one found here:

import torch
import torch.nn as nn
import torch.nn.functional as F
from pytorch_fitmodule import FitModule
from torch.autograd import Variable
import numpy as np


def conv3x3(in_planes, out_planes, stride=1):
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False)


class BasicBlock(FitModule):
    expansion = 1

    def __init__(self, in_planes, planes, stride=1):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(in_planes, planes, stride)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = nn.BatchNorm2d(planes)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != self.expansion * planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, self.expansion * planes,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(self.expansion * planes)
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out += self.shortcut(x)
        out = F.relu(out)
        return out


class ResNet(FitModule):
    def __init__(self, block, num_blocks, num_classes=10):
        super(ResNet, self).__init__()
        self.in_planes = 64

        self.conv1 = conv3x3(3, 64)
        self.bn1 = nn.BatchNorm2d(64)
        self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)
        self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)
        self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)
        self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)
        self.linear = nn.Linear(512 * block.expansion, num_classes)

    def _make_layer(self, block, planes, num_blocks, stride):
        strides = [stride] + [1] * (num_blocks - 1)
        layers = []
        for stride in strides:
            layers.append(block(self.in_planes, planes, stride))
            self.in_planes = planes * block.expansion
        return nn.Sequential(*layers)

    def forward(self, x):    # add additional layers here?
        x = x.float()
        out = F.relu(self.bn1(self.conv1(x).float()).float())
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = F.avg_pool2d(out, 4)
        out = out.view(out.size(0), -1)
        out = self.linear(out)
        out = self.linear(out)
        return out

def ResNet34():
    return ResNet(BasicBlock, [3, 4, 6, 3])

Thank you!

ptrblck · June 22, 2020, 7:54am

I assume multi-target refers to a multi-class classification, i.e. each sample corresponds to one target only.
If that’s the case, you could create the last linear layer with out_features=nb_classes, such that each sample will yield the logits for all classes.
For the criterion you could use nn.CrossEntropyLoss, which expects the model output to be raw logits in the shape [batch_size, nb_classes] (which should be the case using the mentioned modification) and a target in the shape [batch_size], containing the class indices in the range [0, nb_classes-1].
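
As a rough sketch of the shapes described above: the batch size, the number of classes, and the standalone nn.Linear standing in for the model's last layer are illustrative assumptions, not taken from the original model.

```python
import torch
import torch.nn as nn

batch_size, nb_classes = 8, 10  # illustrative values (assumption)

# Stand-in for the final linear layer of the posted ResNet, which receives
# 512 * block.expansion flattened features (512 for BasicBlock).
head = nn.Linear(512, nb_classes)

features = torch.randn(batch_size, 512)  # fake pooled + flattened features
logits = head(features)                  # raw logits, shape [batch_size, nb_classes]

# Targets are class indices in [0, nb_classes - 1] with shape [batch_size].
target = torch.randint(0, nb_classes, (batch_size,))

criterion = nn.CrossEntropyLoss()        # applies log-softmax internally
loss = criterion(logits, target)
loss.backward()
```

Since nn.CrossEntropyLoss applies the log-softmax itself, the model should return the raw logits without a softmax layer at the end.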

If you are working on a regression task, you could also modify the last linear layer and set out_features=regression_features, where regression_features would refer to the number of features your model should predict.
You could add an additional activation function, but it depends on your use case.
nn.MSELoss could be used as the criterion (also nn.L1Loss etc.). The target should have the same shape as the model output in this case.
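
A minimal sketch of one training step for the two-target regression case could look like the code below. The tiny network, the single input channel, the 32x32 input size, and the SGD settings are all assumptions made for illustration; with the ResNet posted above, the equivalent change should be passing num_classes=2 to the constructor, since that value is used as out_features of self.linear.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, regression_features = 8, 2  # two targets, as in the question

# Small stand-in model (assumption), not the ResNet from the question.
# The last linear layer has out_features=regression_features and no
# activation after it, so the outputs are unbounded real values.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # 1 input channel (assumed)
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, regression_features),
)

images = torch.randn(batch_size, 1, 32, 32)             # fake image batch
targets = torch.randn(batch_size, regression_features)  # same shape as the output

criterion = nn.MSELoss()  # nn.L1Loss would be used the same way
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

optimizer.zero_grad()
output = model(images)             # shape [batch_size, regression_features]
loss = criterion(output, targets)
loss.backward()
optimizer.step()
```

If the targets are known to lie in a fixed range (for example [0, 1]), an activation such as torch.sigmoid on the output might be a reasonable addition; otherwise the plain linear output is probably the simplest starting point.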