I have a segmentation network, shown below, which I pre-train (the forward pass is called with preTrain=True) on the VOC dataset. During pretraining, however, only the last layer, conv1_1_2, has gradient flowing and its weights updated. I have put the network class and the gradient-flow plot below.
```python
import copy
from collections import OrderedDict

import torch
import torch.nn as nn
import torch.nn.functional as F


class Network(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # pre-trained features (vgg16 is my helper that builds a Caffe-style VGG-16)
        backbone = vgg16(is_caffe=True)
        l7 = [('fc7', nn.Conv2d(512, 1, 1))]
        l8 = [('fc8', nn.Conv2d(3, 3, 1))]
        self.encoder = copy.deepcopy(backbone)
        self.conv1_1_1 = nn.Sequential(OrderedDict(l7))
        self.conv1_1_2 = nn.Sequential(OrderedDict(l8))

    def forward(self, x, weights, preTrain=True):
        x = self.encoder(x)
        x = x / x.max()
        if not preTrain:
            # weights come from another network branch, only in main training
            suppQueryFusion = [torch.add(x, weight.repeat(1, 1, x.size(2), x.size(3)))
                               for weight in weights]
        else:
            suppQueryFusion = [x for weight in weights]
        weightedFeats = [self.conv1_1_1(feat) for feat in suppQueryFusion]
        concatedFeats = torch.cat(weightedFeats, dim=1)
        normClassFeats = F.normalize(concatedFeats, p=2, dim=1)
        Fusion = self.conv1_1_2(normClassFeats)
        x = self.conv1_1_2(Fusion)
        return x
```
The method used to produce the plot is the one from [Check gradient flow in network].
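For reference, this is roughly how I inspect per-parameter gradients after a backward pass. The snippet below is a minimal sketch using a toy two-conv model as a stand-in for my network (the toy model and dummy input are placeholders, not my actual architecture); a parameter whose mean absolute gradient prints as zero is one with no gradient flow:

```python
import torch
import torch.nn as nn

# Toy stand-in model: encoder-like conv followed by a 1x1 head.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 1, 1),
)

x = torch.randn(2, 3, 16, 16)   # dummy input batch
loss = model(x).mean()          # dummy scalar loss
loss.backward()

# Mean absolute gradient per parameter; 0.0 means no gradient flow.
grad_report = {name: p.grad.abs().mean().item()
               for name, p in model.named_parameters()
               if p.grad is not None}
for name, g in grad_report.items():
    print(f"{name}: {g:.6f}")
```

The gradient-flow plot I posted is essentially a bar chart of these per-parameter values over the real network.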
Could someone explain why the gradient flow is zero for all the earlier layers and how I could fix it?