Building ResNet from scratch - something does not work

Hello, I'm trying to build ResNet18 from scratch (similar to the architecture in the library), but for some reason it does not behave as I want it to.
In the first residual block (ResidualBlock) I don't want the downsampling layers to exist; however, when I print the model, they show up in the architecture.
Am I missing something?

Here is the residual block:

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1, downsample=False):
        super(ResidualBlock, self).__init__()

        # main path: two 3x3 convolutions; only the first one strides
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(out_channels))

        self.downsample = downsample
        print('--------------', self.downsample)  # debug: shows the flag per block

        # 1x1 shortcut that matches the main path's output shape
        self.downsample_layers = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=2),
            nn.BatchNorm2d(out_channels))

        self.relu = nn.ReLU()
        self.out_channels = out_channels

    def forward(self, x):
        residual = x
        out = self.conv(x)
        if self.downsample:
            residual = self.downsample_layers(x)
        out += residual
        out = self.relu(out)
        return out
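
For reference, a single block does run on its own, e.g. (the input shape here is just an arbitrary example):

    block = ResidualBlock(64, 128, stride=2, downsample=True)
    out = block(torch.randn(1, 64, 56, 56))
    print(out.shape)  # torch.Size([1, 128, 28, 28])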

Here is the ResNet:

class ResNet(nn.Module):
    def __init__(self, block, layers, num_classes=10):
        super(ResNet, self).__init__()

        self.inplanes = 64

        # stem: 7x7 convolution, then max pooling
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1))

        self.layer0 = self._make_layer(block, 64, layers[0], stride=1, downsample=False)
        self.layer1 = self._make_layer(block, 128, layers[1], stride=2, downsample=True)
        self.layer2 = self._make_layer(block, 256, layers[2], stride=2, downsample=True)
        self.layer3 = self._make_layer(block, 512, layers[3], stride=2, downsample=True)

        self.avgpool = nn.AvgPool2d(7, stride=1)
        self.fc = nn.Linear(512, num_classes)

    def _make_layer(self, block, planes, blocks, stride=1, downsample=False):
        layers = []

        # only the first block of a stage strides and downsamples the shortcut
        layers.append(block(self.inplanes, planes, stride, downsample=downsample))
        self.inplanes = planes

        # the remaining blocks keep the resolution and channel count
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)  # max pooling already happens inside conv1
        x = self.layer0(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)

        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)

        return x
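
When I create the model with the ResNet18 layout and print it, the downsampling layers also show up inside layer0, even though I pass downsample=False there:

    model = ResNet(ResidualBlock, [2, 2, 2, 2])
    print(model)  # layer0's blocks also list (downsample_layers)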

I based my code on this article: Writing ResNet from Scratch in PyTorch, which works, but I wanted to change it a bit.
Thank you

I'm not sure if I understand your use case correctly, but you are explicitly initializing self.downsample_layers, which is why they show up when you print the model.
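
You can verify this by listing the registered child modules of a block (a quick check, with arbitrary channel sizes):

    block = ResidualBlock(64, 64, stride=1, downsample=False)
    print(dict(block.named_children()).keys())
    # dict_keys(['conv', 'downsample_layers', 'relu'])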

What I was aiming for is: when self.downsample == True, then residual = self.downsample_layers(x). Can you please explain to me where it is initialized?

It’s initialized in the __init__:

    self.downsample_layers = nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=2),
        nn.BatchNorm2d(out_channels))

and you could guard its creation with an if condition checking downsample.
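
A minimal sketch of what that could look like in your __init__, keeping your attribute names:

    self.downsample = downsample
    if downsample:
        # only created, and therefore only registered and printed, when requested
        self.downsample_layers = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=2),
            nn.BatchNorm2d(out_channels))
    else:
        self.downsample_layers = None

In the forward you could then check if self.downsample_layers is not None instead of the boolean flag.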

Yes, but how is this added to the structure?
I mean, the network architecture is defined in def forward(self, x):, no?
What I mean is: why is the if statement ‘ignored’?

The forward pass is defined in forward, while the __init__ is responsible for initializing all submodules, parameters, and buffers.
Once you’ve created a layer in the __init__ method, it will be registered and will show up when you print the model. In the forward you can then use these layers (or of course skip them).
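
Here is a tiny self-contained example of that distinction (Toy is just a made-up module for illustration):

    import torch
    import torch.nn as nn

    class Toy(nn.Module):
        def __init__(self):
            super().__init__()
            self.used = nn.Linear(4, 4)
            self.unused = nn.Linear(4, 4)  # registered, but never called

        def forward(self, x):
            return self.used(x)  # self.unused takes no part in the computation

    print(Toy())  # the printout lists both `used` and `unused`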