Hello there.
I’m currently taking a deep learning course and trying to implement a ResNet. While doing so, I came across this implementation on GitHub:
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    def __init__(self, in_planes, planes):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(planes)
        self.shortcut = nn.Sequential()  # empty by default
        if in_planes != planes:
            # replaced by a conv + batch norm when the channel counts differ
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, planes, 3, padding=1),
                nn.BatchNorm2d(planes))

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out += self.shortcut(x)  # for the input
        out = F.relu(out)
        return out
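(The block itself runs fine for me; here is a quick shape check with made-up channel counts, just to show what I tried:)

import torch

x = torch.randn(1, 16, 32, 32)  # dummy input; the channel counts are just examples
block = BasicBlock(16, 32)      # in_planes != planes, so the conv shortcut is used
print(block(x).shape)           # torch.Size([1, 32, 32, 32])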
I don’t understand the shortcut. The if statement leaves it as an empty Sequential only when the channel counts of two adjacent layers are the same; when they differ, it just runs another conv + batch norm, which looks to me like the regular basic block again.
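For what it’s worth, I did check that an empty nn.Sequential() simply returns its input unchanged, so in the equal-channels case, out += self.shortcut(x) adds x directly:

import torch
import torch.nn as nn

x = torch.randn(2, 16, 8, 8)
identity = nn.Sequential()          # no modules inside
print(torch.equal(identity(x), x))  # True: acts as an identity mapping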
However, later in the code, they build the ResNet like this:
class SmallResNet(nn.Module):
    def __init__(self, in_channel, hidden_channels, num_classes):
        super(SmallResNet, self).__init__()
        self.conv = nn.Conv2d(in_channel, hidden_channels[0], 3, padding=1)  # first conv
        self.bn = nn.BatchNorm2d(hidden_channels[0])  # then batch norm
        # now use 3 residual blocks
        self.res1 = BasicBlock(hidden_channels[0], hidden_channels[1])
        self.res2 = BasicBlock(hidden_channels[1], hidden_channels[2])
        self.res3 = BasicBlock(hidden_channels[2], hidden_channels[3])
        # now do the max pooling
        self.maxpool = nn.MaxPool2d(2, 2)
        self.fc = nn.Linear(hidden_channels[3] * 16 * 16, num_classes)  # flattened size after max pooling
And they call it with hidden channels:

hidden = [16, 32, 64, 128]
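Presumably it is instantiated with something like the following (the in_channel and num_classes values here are my guesses; the 16 * 16 in the fc layer suggests 32x32 inputs halved once by the 2x2 maxpool):

model = SmallResNet(3, hidden, 10)  # hypothetical: 3-channel 32x32 inputs, 10 classes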
So it seems to me that this misses the whole point of ResNet, because there are no "skips" here.
What am I getting wrong?
Thanks.