Semantic segmentation for binary classification issue

I am very new to PyTorch and Deep Learning in general. I have a set of grayscale images which I convert to 3-channel images by repeating the first channel two more times. I am using the fcn_resnet101 model from the torchvision models. The model only predicts one class for all images. Am I missing something in the methodology below?

import torch
import torchvision
import loader
from loader import DataLoaderSegmentation
import torch.nn as nn
import torch.optim as optim
import numpy as np
from torch.utils.data.sampler import SubsetRandomSampler
batch_size = 1
validation_split = .2
shuffle_dataset = True
random_seed= 66

n_class    = 2
num_epochs = 1
#lr         = 1e-4
#momentum   = 0
w_decay    = 1e-5
step_size  = 50
gamma      = 0.5

traindata = DataLoaderSegmentation('/home/ubuntu/Downloads/Brain/test0716')
dataset_size = len(traindata)
indices = list(range(dataset_size))
split = int(np.floor(validation_split * dataset_size))
#if shuffle_dataset:
np.random.seed(random_seed)
np.random.shuffle(indices)
train_indices = indices[split:]
val_indices = indices[:split]

#train_sampler = SubsetRandomSampler(train_indices)
#valid_sampler = SubsetRandomSampler(val_indices)
trainloader = torch.utils.data.DataLoader(traindata, batch_size=batch_size)
validation_loader = torch.utils.data.DataLoader(traindata, batch_size=batch_size)

#trainloader = torch.utils.data.DataLoader(traindata, batch_size=1, shuffle=False, num_workers=2)

model = torchvision.models.segmentation.fcn_resnet101(pretrained=False, progress=True, aux_loss=None, num_classes=2).cuda()
criterion = nn.CrossEntropyLoss().cuda()
#optimizer = optim.RMSprop(model.parameters(), lr=lr, momentum=momentum, weight_decay=w_decay)
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.train()


for epoch in range(num_epochs):
    print(epoch)
    for (i, l) in trainloader:
        i = i.to(device)
        l = l.to(device=device, dtype=torch.int64)
        outt = model(i)
        loss = criterion(outt['out'], l.squeeze(1))
        #print(loss)
        loss.backward()
        optimizer.step()
torch.save(model, '/home/ubuntu/Downloads/Brain/test0716/mod.pth')

Thanks for all your help

In your training loop you are not zeroing out the gradients, so they will accumulate across iterations.
Add optimizer.zero_grad() at the beginning of your DataLoader loop:

for (i, l) in trainloader:
    optimizer.zero_grad()
    i = ...

Besides that the code looks fine. :)
Let me know if that helps.
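
For completeness, here is a sketch of the corrected loop, reusing the names from your posted code (i, l, outt, criterion):

for epoch in range(num_epochs):
    for (i, l) in trainloader:
        optimizer.zero_grad()                        # clear gradients from the last iteration
        i = i.to(device)
        l = l.to(device=device, dtype=torch.int64)
        outt = model(i)                              # fcn_resnet101 returns a dict
        loss = criterion(outt['out'], l.squeeze(1))
        loss.backward()                              # compute new gradients
        optimizer.step()                             # update the parameters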

Thank you for your quick response. I included your suggestion and reran the code. I am testing the model with an image that I know has 2 classes.

import torch
import torchvision
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import torchvision.transforms as T

def decode_segmap(image, nc=2):
    # map each class index to an RGB color: 0 -> black, 1 -> white
    label_colors = np.array([(0, 0, 0), (255, 255, 255)])
    r = np.zeros_like(image).astype(np.uint8)
    g = np.zeros_like(image).astype(np.uint8)
    b = np.zeros_like(image).astype(np.uint8)

    # color all pixels predicted as class l with that class' color
    for l in range(0, nc):
        idx = image == l
        r[idx] = label_colors[l, 0]
        g[idx] = label_colors[l, 1]
        b[idx] = label_colors[l, 2]

    rgb = np.stack([r, g, b], axis=2)
    return rgb
#import pic directly, webmethod check
import cv2
img = cv2.imread('/home/ubuntu/Downloads/Brain/test0716/train/slice_src_BN01002_032.png')  # note: cv2 loads images in BGR order
label = cv2.imread('/home/ubuntu/Downloads/Brain/test0716/trainannot/slice_src_BN01002_032.png')
plt.imshow(img); plt.show()
plt.imshow(label); plt.show()
trf = T.Compose([T.ToTensor()])
inp = trf(img).unsqueeze(0).cuda()
fcn = torch.load('/home/ubuntu/Downloads/Brain/test0716/mod.pth')
fcn.eval()
sam_out = fcn(inp)['out']
om = torch.argmax(sam_out.squeeze(), dim=0).cpu().numpy()
rgb = decode_segmap(om)
print(rgb.shape)
print (np.unique(rgb))
plt.imshow(rgb); plt.show()

This is the code I use to visualize the result. The prediction gives me an image with only 1 label. I greatly appreciate your help; please let me know if you have any suggestions.
Thanks
Nishanth

The code for visualization should generally work:

output = torch.randn(1, 2, 224, 224)
pred = torch.argmax(output.squeeze(), 0).numpy()
seg_output = decode_segmap(pred)

plt.imshow(seg_output)

This code snippet gives a random segmentation output.

Did you observe your training and validation loss and accuracy?
Your model might be overfitting to a single class, but that should already be visible during training in your validation loss.

Thank you for your continued response. I tried your code snippet and it did give me a random segmented output.

As for your question

“Did you observe your training and validation loss and accuracy?”

I am not sure how I can track the training and validation loss and accuracy during training. Also, how do I implement validation and compute the validation loss during the training epochs?

How do I know if the model is overfitting to one of the classes? Also, how do I correct for it?

Thanks
Nishanth

Usually you would create a train function to train the model for a complete epoch using the training dataset. Afterwards you could run the evaluation on the validation dataset.
Have a look at the ImageNet example to see how train and validate are defined.
The same example uses this code to print the progress.
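
As a minimal sketch of that pattern, reusing model, criterion, optimizer, and the loaders from your script (the accuracy here is plain pixel accuracy, just for illustration):

def train(model, loader, criterion, optimizer, device):
    model.train()
    total_loss = 0.
    for i, l in loader:
        i = i.to(device)
        l = l.to(device=device, dtype=torch.int64).squeeze(1)
        optimizer.zero_grad()
        out = model(i)['out']
        loss = criterion(out, l)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(loader)

def validate(model, loader, criterion, device):
    model.eval()
    total_loss, correct, total = 0., 0, 0
    with torch.no_grad():                        # no gradients needed for evaluation
        for i, l in loader:
            i = i.to(device)
            l = l.to(device=device, dtype=torch.int64).squeeze(1)
            out = model(i)['out']
            total_loss += criterion(out, l).item()
            pred = out.argmax(dim=1)             # predicted class per pixel
            correct += (pred == l).sum().item()
            total += l.numel()
    return total_loss / len(loader), correct / total

for epoch in range(num_epochs):
    train_loss = train(model, trainloader, criterion, optimizer, device)
    val_loss, val_acc = validate(model, validation_loader, criterion, device)
    print('epoch {}: train loss {:.4f}, val loss {:.4f}, val pixel acc {:.4f}'.format(
        epoch, train_loss, val_loss, val_acc))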

One signal of overfitting is a decreasing training loss while the validation loss stays static or increases.
Further training at that point usually won’t give you any benefit, so you would stop there.
You could counter it by e.g.

  • using (or collecting) more data
  • removing capacity from the model, e.g. by reducing the number of parameters
  • adding regularization, e.g. dropout or weight decay (see the sketch below)
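
As a sketch: weight decay is an L2 penalty already exposed by the optimizer, and dropout can be inserted into a model definition. TinyNet below is a hypothetical toy example, not your FCN, and both values are arbitrary:

import torch
import torch.nn as nn
import torch.optim as optim

# L2 regularization on the weights via the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9,
                      weight_decay=1e-5)

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.drop = nn.Dropout2d(p=0.5)          # randomly zeroes whole feature maps
        self.out = nn.Conv2d(16, 2, 1)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        x = self.drop(x)                         # only active in model.train() mode
        return self.out(x)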

You can find way more information and a better explanation in Chapter 5 of the DeepLearningBook.

Thank you for your response. I noticed that my dataset is heavily imbalanced: there are way more pixels representing the background than the organ (the label I am interested in). I have 18000 images, of which only 600 contain the organ class I am looking for. Is it possible that this is the reason why only one of the classes is learned?

Thanks
Nishanth

Yes, that could be possible.
You could try to oversample the minority class using e.g. a WeightedRandomSampler.
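
A minimal sketch, assuming each sample of traindata returns (image, label) as in your training loop. The has_organ computation is a hypothetical way to flag samples; adapt it to your DataLoaderSegmentation:

import numpy as np
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

# hypothetical per-sample flag: 1 if the mask contains the organ, else 0
has_organ = np.array([int(label.max() > 0) for _, label in traindata])

# weight each sample inversely to its class frequency
class_count = np.bincount(has_organ)             # e.g. [17400, 600]
class_weight = 1. / class_count
sample_weights = class_weight[has_organ]         # one weight per sample

sampler = WeightedRandomSampler(weights=torch.from_numpy(sample_weights),
                                num_samples=len(sample_weights),
                                replacement=True)
trainloader = DataLoader(traindata, batch_size=batch_size, sampler=sampler)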

Hi @ptrblck, I tried working with a more even dataset with balanced classes. I also tried playing around with the weight_decay parameter in the optimizer. However, I am still running into the issue where my model only outputs 1 class. I finally decided to just increase the number of epochs, overfit the model on the training data, and run the prediction on a training image. It turns out I am still only getting 1 class when I should get 2. I tried scouring the internet for a similar issue but couldn’t find one. Any and all help is very much appreciated.

Thanks
Nishanth

I have provided a detailed explanation in the following post.