I tried to use the DeepLabV3+ head for segmentation as part of a new model architecture, and I couldn't train it. While debugging, I found a problem that seems to be inherent in this head class.
The head contains 4 ASPPConv modules and one ASPPPooling module. The latter contains the following layers:
class ASPPPooling(nn.Sequential):
    def __init__(self, in_channels, out_channels):
        super(ASPPPooling, self).__init__(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, out_channels, 1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU())

    def forward(self, x):
        size = x.shape[-2:]
        for mod in self:
            x = mod(x)
        return F.interpolate(x, size=size, mode='bilinear', align_corners=False)
The bug is hiding in the BatchNorm (BN) layer, which cannot handle a tensor with a 1*1 spatial shape in "train mode", yet the AdaptiveAvgPool2d layer at the start of the block forces exactly that shape.
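A minimal sketch that should reproduce the failure, assuming it is triggered with a batch size of 1. As far as I can tell, BatchNorm in train mode needs more than one value per channel, so a larger batch or eval mode should not raise. The channel counts, the input size, and the helper `try_forward` are made up for illustration and are not part of the original head.

```python
import torch
import torch.nn as nn

# Stand-in for the layers of the pooling branch; the channel sizes (8 -> 4)
# are arbitrary values chosen only for this illustration.
branch = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),          # collapses H x W to 1 x 1
    nn.Conv2d(8, 4, 1, bias=False),
    nn.BatchNorm2d(4),                # sees N * 1 * 1 values per channel
    nn.ReLU(),
)


def try_forward(batch_size, training):
    """Hypothetical helper: run one forward pass and report shape or error."""
    branch.train(training)
    x = torch.randn(batch_size, 8, 16, 16)
    try:
        return tuple(branch(x).shape)
    except ValueError as err:
        return f"ValueError: {err}"


# Train mode, batch of 1: BatchNorm gets a single value per channel.
print(try_forward(batch_size=1, training=True))
# Train mode, batch of 2: two values per channel.
print(try_forward(batch_size=2, training=True))
# Eval mode, batch of 1: running statistics are used instead of batch statistics.
print(try_forward(batch_size=1, training=False))
```

If the assumption holds, only the first call reports an error, which would mean the problem depends on the batch size in train mode rather than on the 1*1 shape alone.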
I apologize for my weak English,
and I would appreciate your help!