I use a residual block and stack it many times to construct my network, so I wrote a function that builds the block's layer list and returns it as an `nn.Sequential`:
```python
def make_resblock(in_channels, out_channels, kernel_size, stride, padding):
    res = [
        nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding),
        nn.BatchNorm2d(out_channels),
        nn.ReLU()
    ]
    res += res
    return nn.Sequential(*res)

# in __init__
self.res = make_resblock(256, 256, 3, 1, 1)
```
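While debugging I checked what `res += res` actually does to the list. A minimal pure-Python sketch (the `Layer` class is just a stand-in for the `nn` modules) suggests it appends references to the existing objects rather than creating fresh copies:

```python
# Stand-in for an nn module, just to inspect list identity.
class Layer:
    pass

res = [Layer(), Layer(), Layer()]
res += res  # in-place extend: the list now holds each object twice

print(len(res))          # 6
print(res[0] is res[3])  # True: the "two" first layers are the same object
```

So the six-entry list produced by my function contains only three distinct layer objects, each appearing twice.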
The model constructs without errors, but the network becomes untrainable: the loss never decreases, as if backpropagation were blocked by this block. If I remove it from the forward pass, the network trains normally. And if I replace my function's output with this hand-written version,
```python
self.res = nn.Sequential(
    nn.Conv2d(256, 256, 3, 1, 1),
    nn.BatchNorm2d(256),
    nn.ReLU(),
    nn.Conv2d(256, 256, 3, 1, 1),
    nn.BatchNorm2d(256),
    nn.ReLU()
)
```
everything works and the block trains fine. What is wrong with my function, and why? How should I modify it? Since I will use the residual block many times, I do not want to redeclare it by hand at every use.
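For what it's worth, here is the variant I am considering (a sketch, assuming the problem is that `res += res` reuses the same layer objects instead of constructing new ones): build the layer list twice so every entry in the `Sequential` is a distinct module.

```python
import torch
import torch.nn as nn

def make_resblock(in_channels, out_channels, kernel_size, stride, padding):
    # Each call to layers() constructs brand-new modules, so unpacking it
    # twice gives six distinct layers instead of three layers listed twice.
    def layers():
        return [
            nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
        ]
    return nn.Sequential(*layers(), *layers())

res = make_resblock(256, 256, 3, 1, 1)
print(res[0] is res[3])                        # False: distinct conv layers
print(res(torch.randn(1, 256, 8, 8)).shape)    # torch.Size([1, 256, 8, 8])
```

Is this the right way to fix it, or is there a more idiomatic pattern for a reusable block factory?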