The input dim of resnet.layer2[0].downsample[0] in resnet34 differs from the last output dim

The pretrained resnet34 works well. However, I am confused by the input dim of the downsample branch in the first basic block of layer2. The subnet is printed as follows:
(layer2): Sequential (
  (0): BasicBlock (
    (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
    (relu): ReLU (inplace)
    (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
    (downsample): Sequential (
      (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
      (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
    )
  )
Why is the input dim of the Conv2d in the downsample layer 64, not 128?
Could someone explain this? Thanks a lot!

Not entirely sure, but taking a long shot: could this be happening because of the skip connections, as opposed to the simple feed-forward nature of networks like VGG? I will have to read up on it, though.
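
To make the skip-connection idea concrete: in a residual block the downsample branch is applied to the block's input (64 channels here), not to the output of conv2, and its result is then added to the main path. So it has to map 64 -> 128 channels with stride 2 for the two shapes to match. Below is a rough sketch in plain Python that only tracks shapes. The helper `conv2d_out_shape` is a hypothetical name made up for illustration, and the 56x56 input size is an assumption (what layer1 would output for a 224x224 image); BatchNorm and ReLU are left out since they do not change shapes.

```python
def conv2d_out_shape(shape, out_channels, kernel_size, stride, padding):
    """Return (C, H, W) after a Conv2d, using the standard output-size formula."""
    in_channels, height, width = shape
    out_h = (height + 2 * padding - kernel_size) // stride + 1
    out_w = (width + 2 * padding - kernel_size) // stride + 1
    return (out_channels, out_h, out_w)


# Assumed block input: 64 channels at 56x56 (output of layer1 for a 224x224 image).
x = (64, 56, 56)

# Main path: conv1 (64 -> 128, stride 2), then conv2 (128 -> 128, stride 1).
main = conv2d_out_shape(x, 128, kernel_size=3, stride=2, padding=1)
main = conv2d_out_shape(main, 128, kernel_size=3, stride=1, padding=1)

# Shortcut path: downsample[0] takes the block input x, not the output of conv2.
shortcut = conv2d_out_shape(x, 128, kernel_size=1, stride=2, padding=0)

print("main path:", main)
print("shortcut: ", shortcut)
```

Both paths end up with the same shape, which is what allows them to be added element-wise at the end of the block. If the downsample conv took 128 input channels, it could not be applied to the 64-channel block input at all.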