Not able to train Fully Convolutional Network for image segmentation

I set up a really small toy Full Connected Network for image segmentation on Pascal VOC . The architecture looks like following ==> Conv - Conv- Conv- Transpose conv-Transpose conv-Transpose conv .I set up a per pixel cross Entropy Loss and trained it with Adam . But i am not able to lower down loss . Can anyone tell what i have fundamentally done wrong ,Is there something wrong with the way i set up the loss?.The code is below.

dataset1 = dset.VOCSegmentation(root = ‘/.data’ , download = True , transform = T.Compose([T.Resize((224,224)),T.ToTensor()])
,target_transform = T.Compose([T.Resize((224,224)),T.ToTensor()]))
train_loader = DataLoader(dataset1, batch_size = 32 )

class Model(nn.Module):
def init(self):
super().init()
self.conv1 = nn.Conv2d(3,64,3,2,1)
self.conv2 = nn.Conv2d(64,128,3,2,1)
self.conv3 = nn.Conv2d(128,256,3,2,1)

    self.deconv1 =nn.ConvTranspose2d(256,128,4,2,1)
    self.deconv2 = nn.ConvTranspose2d(128,64,4,2,1)
    self.deconv3 = nn.ConvTranspose2d(64,64,4,2,1)
    
    self.finalconv = nn.Conv2d(64,22,3,1,1)
    
def forward(self,x):
    x = self.conv1(x).clamp(min=0)
    x = self.conv2(x).clamp(min=0)
    x = self.conv3(x).clamp(min=0)
    
    x = self.deconv1(x).clamp(min=0)
    x = self.deconv2(x).clamp(min=0)
    x = self.deconv3(x).clamp(min=0)
    
    return self.finalconv(x)

for epoch in range(num_epoch):
for image,segments in enumerate(train_loader):

    segments = segments*255
    segments = segments.int().long()
    segments[segments==255] = 21
    
    scores = model(image)
    
    scores = scores.permute(0,2,3,1).reshape(-1,22)
    
    segments = segments.reshape(segments.shape[0],segments.shape[2],segments.shape[3]).reshape(-1)
    
    loss = F.cross_entropy(scores,segments)
    
    loss.backward()
    
    optimizer.step()
    
    optimizer.zero_grad()
    
    print(loss.item())

Could you explain what these lines of code are doing and what values are stored in segments afterwards?

    segments = segments*255
    segments = segments.int().long()
    segments[segments==255] = 21

Based on your code you might be normalizing the targets and the mentioned lines of code might try to denormalize them.

If you’ve checked the segments and made sure thay contain the right values, I would recommend to try to overfit a small data sample (e.g. just 10 samples) and make sure your mode is able to do so.

The original segmented image returned in dataset(w/o ToTensor Transform) is a segmentation mask having a value of 0-20(category) or 255(background) for each pixel . But with ToTensor() Transform It will normalize the mask (dividing by 255) and But to use it for cross entropy loss i need to denormalize and in addition i have replaced the values of background (255) with extra category (21). Segments seems to contain right value , I have tried Overfitting a small dataset ,Tuning learning rate etc also , But It seems there is some basic flow in set up that loss is not decaying .

The model itself seems to be correct, as I can perfectly overfit a random input and target batch of 10 samples:

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3,64,3,2,1)
        self.conv2 = nn.Conv2d(64,128,3,2,1)
        self.conv3 = nn.Conv2d(128,256,3,2,1)
        
        self.deconv1 =nn.ConvTranspose2d(256,128,4,2,1)
        self.deconv2 = nn.ConvTranspose2d(128,64,4,2,1)
        self.deconv3 = nn.ConvTranspose2d(64,64,4,2,1)
        
        self.finalconv = nn.Conv2d(64,22,3,1,1)
            
    def forward(self,x):
        x = self.conv1(x).clamp(min=0)
        x = self.conv2(x).clamp(min=0)
        x = self.conv3(x).clamp(min=0)
        
        x = self.deconv1(x).clamp(min=0)
        x = self.deconv2(x).clamp(min=0)
        x = self.deconv3(x).clamp(min=0)
        
        return self.finalconv(x)

model = Model()
data = torch.randn(10, 3, 24, 24)
target = torch.randint(0, 22, (10, 24, 24))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for epoch in range(200):
    optimizer.zero_grad()
    output = model(data)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
    print('epoch {}, loss {}'.format(epoch, loss.item()))


pred = torch.argmax(output, 1)
acc = (pred == target).float().mean()
print(acc)
> tensor(1.)
1 Like