I use a residual block and stack it many times to construct my network, so I wrote a function that builds the block's layer list and returns it as an `nn.Sequential`:
```python
def make_resblock(in_channels, out_channels, kernel_size, stride, padding):
    res = [
        nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding),
        nn.BatchNorm2d(out_channels),
        nn.ReLU()
    ]
    res += res
    return nn.Sequential(*res)

# in __init__
self.res = make_resblock(256, 256, 3, 1, 1)
```
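While debugging I checked what `res += res` actually does to the list. A minimal pure-Python sketch (the `Layer` class is just a stand-in for the `nn` modules) suggests it appends references to the existing objects rather than creating fresh copies:

```python
# Stand-in for an nn module, just to inspect list identity.
class Layer:
    pass

res = [Layer(), Layer(), Layer()]
res += res  # in-place extend: the list now holds each object twice

print(len(res))          # 6
print(res[0] is res[3])  # True: the "two" first layers are the same object
```

So the six-entry list produced by my function contains only three distinct layer objects, each appearing twice.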
The model constructs without errors, but the network becomes untrainable: the loss never decreases, as if backpropagation were blocked by this block. If I remove it from the forward pass, the network trains normally. And if I replace my function's output with this hand-written version,
```python
self.res = nn.Sequential(
    nn.Conv2d(256, 256, 3, 1, 1),
    nn.BatchNorm2d(256),
    nn.ReLU(),
    nn.Conv2d(256, 256, 3, 1, 1),
    nn.BatchNorm2d(256),
    nn.ReLU()
)
```
everything works and the block trains fine. What is wrong with my function, and why? How should I modify it? Since I will use the residual block many times, I do not want to redeclare it by hand at every use.
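For what it's worth, here is the variant I am considering (a sketch, assuming the problem is that `res += res` reuses the same layer objects instead of constructing new ones): build the layer list twice so every entry in the `Sequential` is a distinct module.

```python
import torch
import torch.nn as nn

def make_resblock(in_channels, out_channels, kernel_size, stride, padding):
    # Each call to layers() constructs brand-new modules, so unpacking it
    # twice gives six distinct layers instead of three layers listed twice.
    def layers():
        return [
            nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
        ]
    return nn.Sequential(*layers(), *layers())

res = make_resblock(256, 256, 3, 1, 1)
print(res[0] is res[3])                        # False: distinct conv layers
print(res(torch.randn(1, 256, 8, 8)).shape)    # torch.Size([1, 256, 8, 8])
```

Is this the right way to fix it, or is there a more idiomatic pattern for a reusable block factory?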