Translating Architecture to Code

So I’m new to deep learning and I’m trying to implement the following CNN architecture in PyTorch:

This is what I have for code so far:

import torch.nn as nn 
from torch.autograd import Variable
import torch.nn.functional as F


class CNet(nn.Module):
    def __init__(self, num_classes=8):
        # Input x is (128, 128, 1)
        super(CNet, self).__init__()

        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(64, eps=0.001),
            nn.Conv2d(3, 64, kernel_size=3, stride=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(64, eps=0.001),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(3, 128, kernel_size=3, stride=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(128, eps=0.001),
            nn.Conv2d(3, 128, kernel_size=3, stride=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(128, eps=0.001),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(3, 256, kernel_size=3, stride=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(256, eps=0.001),
            nn.Conv2d(3, 256, kernel_size=3, stride=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(256, eps=0.001),
            nn.MaxPool2d(kernel_size=2, stride=2)
            )

        self.classifier = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(16*16*256, 2048),
            nn.ELU(inplace=True),
            nn.BatchNorm1d(2048, eps=0.001),  # 1d batchnorm, since the input here is flattened
            nn.Linear(2048, num_classes)
            )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), 16 * 16 * 256)
        x = self.classifier(x)
        return x

Am I on the right track here? I also found a Keras equivalent of the architecture, but I was confused about how to translate the last portion, the fully connected layers, into PyTorch. This is how it looks in Keras:

model.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Flatten())
model.add(Dense(2048))
model.add(keras.layers.ELU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(7, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Any help or hints in the right direction is helpful, thank you!


In your current implementation, the in_channels are incorrectly defined.
Since the input image seems to have one channel, you should set in_channels=1:

nn.Conv2d(in_channels=1, out_channels=64, kernel_size=3, stride=1)

The same applies to all other conv layers, e.g. the second conv layer should have in_channels=64, since the first one returns that many output channels.

Another issue is that in the target architecture the spatial size of the activation does not change after each conv layer, which means padding was used. For a kernel size of 3, you would have to use padding=1 to keep the same shape; see the sketch below.
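To sanity-check both fixes, here is a minimal sketch (assuming a single-channel 128x128 input, as in the comment in your code):

import torch
import torch.nn as nn

# Two conv layers with chained in/out channels, followed by pooling
block = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=1),   # in_channels=1 for the grayscale input
    nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),  # in_channels matches the previous out_channels
    nn.MaxPool2d(kernel_size=2, stride=2),
)
x = torch.randn(1, 1, 128, 128)  # (batch, channels, height, width)
print(block(x).shape)  # torch.Size([1, 64, 64, 64]): the convs keep 128x128, pooling halves it

After three such pooling stages the spatial size goes 128 -> 64 -> 32 -> 16, which is where the 16*16*256 in your classifier comes from.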

The classifier part of your model looks good. If you are dealing with a multi-class classification use case, you could just use nn.CrossEntropyLoss as the criterion and create the optimizer the same way it’s done in the Keras code.
I’m not sure why the Keras version uses 7 output neurons.
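One thing to watch when translating from Keras: nn.CrossEntropyLoss applies log-softmax internally, so unlike the Keras Dense(..., activation='softmax') layer, your model should output raw logits, and the targets are class indices rather than one-hot vectors. A minimal sketch (the shapes here are just for illustration):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(4, 8)            # (batch_size, num_classes), raw scores without softmax
targets = torch.tensor([0, 3, 7, 1])  # class indices, not one-hot
loss = criterion(logits, targets)
print(loss.item())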


Thanks! Seems like the person who made the Keras version decided to use 7 classes instead of the original 8 classes from the paper. One other thing: the paper says to use Xavier initialization, and from what I read, in PyTorch you either apply it to every individual layer or directly to the linear transformation. I tried applying it directly to the linear transformation and got the following error:

AttributeError: 'Linear' object has no attribute 'ndimension'

This is what my code looks like so far:

import torch
import torch.nn as nn 
import torch.utils.model_zoo as model_zoo
from torch.autograd import Variable
import torch.nn.functional as F


class CNet(nn.Module):
    def __init__(self, num_classes=7):
        # Input x is (128, 128, 1)
        super(CNet, self).__init__()

        self.features = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(64, eps=0.001),
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(64, eps=0.001),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(128, eps=0.001),
            nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(128, eps=0.001),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(256, eps=0.001),
            nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
            nn.ELU(inplace=True),
            nn.BatchNorm2d(256, eps=0.001),
            nn.MaxPool2d(kernel_size=2, stride=2)
            )

        self.classifier = nn.Sequential(
            nn.Dropout(0.5),
            nn.init.xavier_uniform(nn.Linear(16*16*256, 2048)),
            nn.ELU(inplace=True),
            nn.BatchNorm1d(2048, eps=0.001),
            nn.init.xavier_uniform(nn.Linear(2048, num_classes))
            )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), 16 * 16 * 256)
        x = self.classifier(x)
        return x

# Testing
model = CNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)

You should call the init functions on the specific parameter, i.e. the weight or bias tensor. nn.init.xavier_uniform expects a tensor, so passing the whole nn.Linear module is what raises the ndimension error.
In this case I would just initialize the weights after creating the layers, or in a separate method (sketched after the snippet below):

self.classifier = nn.Sequential(
    nn.Dropout(0.5),
    nn.Linear(16*16*256, 2048),
    nn.ELU(inplace=True),
    nn.BatchNorm1d(2048, eps=0.001),
    nn.Linear(2048, num_classes)
)
        
nn.init.xavier_uniform_(self.classifier[1].weight)
nn.init.xavier_uniform_(self.classifier[4].weight)
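For the separate-method approach mentioned above, a minimal sketch (assuming you want Xavier init for all linear layers) could use self.apply with an isinstance check:

def _init_weights(self, m):
    # Called once per submodule via self.apply; only touches linear layers
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

# in __init__, after building self.features and self.classifier:
# self.apply(self._init_weights)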