Inputting an image to a bunch of stacked ResNet blocks vs stacked Convolutional blocks

I am confused: why would we want to feed an image into a set of stacked ResNet blocks rather than into a stack of plain convolutional blocks, as is traditionally done?

Also, how should we stack ResNet blocks on top of each other? Is there a code sample or paper that shows this?

I drew this diagram for illustration:

Wouldn’t this be the classical ResNet architecture as seen e.g. here?
Depending on the flavor of the ResNet, you could be using Bottleneck or BasicBlock as part of the “ResNet Blocks”.
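
For context, the defining feature of a “ResNet block” versus a plain stack of convolutions is the skip connection: the block computes F(x) + x, so the convolutions only need to learn a residual correction on top of the identity, which keeps gradients flowing through very deep stacks. A minimal BasicBlock-style sketch (the class name and layout here are illustrative, not the torchvision source):

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal BasicBlock-style residual block (illustrative sketch)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                               # skip connection
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + identity                       # F(x) + x
        return self.relu(out)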

I am actually not sure what is meant by “ResNet block” here.

If this model is taken from another research paper, I would expect the paper to define the “ResNet Block”; my best guess would be something like:

import torchvision.models as models

model = models.resnet18()
print(model.layer1)
> Sequential(
  (0): BasicBlock(
    (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (1): BasicBlock(
    (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
)
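
To address the stacking part of the question: residual blocks are ordinary modules, so chaining them with nn.Sequential is enough; layer1 above is built exactly this way. A minimal sketch, assuming the torchvision import path below:

import torch
import torch.nn as nn
from torchvision.models.resnet import BasicBlock

# Two stacked BasicBlocks, mirroring layer1 of resnet18
blocks = nn.Sequential(BasicBlock(64, 64), BasicBlock(64, 64))
x = torch.randn(1, 64, 56, 56)  # (batch, channels, height, width)
print(blocks(x).shape)          # torch.Size([1, 64, 56, 56])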

Thanks a lot for your response. Just checking: is this your definition of “ResNet block” as well?

https://d2l.ai/chapter_convolutional-modern/resnet.html

The figure would represent the BasicBlock.
