Semantic Segmentation Classes

Hi, community,

I am trying to train my UNet model on a multi-class dataset. My ground-truth masks are grayscale and not normalized, with pixel values in the range 0-3, where each value corresponds to a class (0=background, 1=tissue1, 2=tissue2, 3=tissue3). In this case, should I map the classes somehow? And how should I use softmax and argmax in the evaluation?

I am asking this because my model is predicting very badly. With the Adam optimizer, the prediction is all black until epoch 900; at epoch 900 it predicts the background and the correct shape of tissue3, but with the pixel values changed (e.g. pixel=1). With the SGD optimizer it starts with a reasonable segmentation, but by epoch 200 it predicts only background. I have no idea what is happening. I hope someone can help me.

Best Regards,
Giulia

It seems your target segmentation maps are already properly mapped.
As a quick test, could you check the unique values via print(target.unique()) and make sure that only the valid class indices are returned?

Assuming your model outputs [batch_size, nb_classes, height, width], then preds = torch.argmax(output, dim=1) would give you the predicted class indices.
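For example, a minimal sketch with random logits, just to illustrate the shapes:

import torch

# hypothetical shapes, just for illustration
batch_size, nb_classes, height, width = 2, 4, 8, 8
output = torch.randn(batch_size, nb_classes, height, width)  # raw logits from the model

preds = torch.argmax(output, dim=1)  # per-pixel class index with the highest logit
print(preds.shape)     # torch.Size([2, 8, 8])
print(preds.unique())  # a subset of tensor([0, 1, 2, 3])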

Hi @ptrblck ,

Thank you for replying. I checked it, and it is incorrect (values = [0, 1]). But this problem only occurs after transforming the targets to a long tensor; before this step the unique values are correct (0, 0.333, 0.666, 1). Note: I divide the pixel values by 3 to normalize them.

Why do the values change when converting to a long tensor, and how can I solve it?

Best,
Giulia

The output is expected to have the values [0, 1], since casting [0, 0.333, 0.666, 1] to long() truncates the fractional part.
Given that the target already seemed to contain the right class indices, you should not normalize it additionally. Why would you want to change/normalize its values?
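A small example to see the truncation directly:

import torch

t = torch.tensor([0.0, 0.333, 0.666, 1.0])
print(t.long())                # tensor([0, 0, 0, 1]) -> fractional part is dropped
print((t * 3).round().long())  # tensor([0, 1, 2, 3]) -> recovers the class indices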

I did this normalization because ToPILImage was creating a different output: "as you are passing a FloatTensor, which is assumed to be in the range [0, 1], while yours has the range [0, 4]" (Pixel values changes when apply transform.ToTensor - #6 by ptrblck).

When I did not normalize, ToPILImage mapped 0 to 0, and 1, 2, and 3 all to 1.

And shouldn't the target unique values display as [0, 0.333, 0.666, 1] instead of [0, 1]?

I might not fully understand the current use case.
Are you using ToPILImage on the model output to visualize it or on the target tensors?
In case it's just to visualize the target, you might scale the values to [0, 0.33, 0.66, 1]. However, don't transform the target during training and keep the values as the class indices [0, 1, 2, 3].
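Something like this sketch (the target here is just a random placeholder for your mask):

import torch
from torchvision.utils import save_image

target = torch.randint(0, 4, (1, 256, 256))  # class indices 0..3, keep these for the loss

# only for visualization: scale to [0, 1] so the saved grayscale image shows all four classes
save_image(target.float() / 3.0, "target_vis.png")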

Okay, thank you very much! Now the targets look alright.

"Target Unique tensor([0, 1, 2, 3], device='cuda:0')"

Best,
Giulia

Hey, one more doubt.

Based on this softmax example.

def check_accuracy(loader, model, num_labels, time=0, device=DEVICE):
    num_correct = 0
    num_pixels = 0
    dice_score = 0
    
    model.eval()

    with torch.no_grad():
        for x, y in loader:
            x = x.to(device)
            preds = model(x)
            y = y.to(device)

            lossfn = nn.CrossEntropyLoss()
            celoss = lossfn(preds, y.long())

            #y = y.unsqueeze(1)

            preds_labels = torch.argmax(preds, dim=1)
    
            num_correct += (preds_labels == y).sum()
            num_pixels += torch.numel(preds_labels)
     
            dice_score = dice_loss(y, preds_labels, num_labels)

    acc = evaluate_segmentation(preds_labels, y, 4, score_averaging='weighted')
    acc.append(dice_score)
    acc.append(celoss.cpu().numpy())

    print_and_save_results(num_correct, num_pixels, acc, time)  

    model.train()

def save_predictions_as_imgs(loader, model, epoch, folder="data/predictions/", device=DEVICE):
    print("=> Saving predictions as images")
    model.eval()
    with torch.no_grad():    
        for idx, (x, _) in enumerate(loader):
            x = x.to(device)
            preds = model(x)
            preds_labels = torch.argmax(preds, dim=1)
           # preds = torch.log_softmax(model(x), 1)
           # preds_labels = torch.argmax(preds, 1)
            
            preds_labels = label_to_pixel(preds_labels)

            now = datetime.now().strftime("%Y%m%d_%H%M%S")

            save_image(preds_labels, f"{folder}{now}_pred_e{epoch}_i{idx}.png")

            # preds_labels = ToPILImage()(preds_labels).convert("L")
            # preds_labels.save(f"{folder}{now}_pred_e{epoch}_i{idx}.png")
            
    model.train()


def save_validation_as_imgs(loader, folder="data/predictions/", device=DEVICE):
    print("=> Saving validation masks as images")

    for idx, (_, y) in enumerate(loader):
        y = y.to(device)
        val = y.unsqueeze(1)

        now = datetime.now().strftime("%Y%m%d_%H%M%S")
        
        save_image(val, f"{folder}{now}_val_i{idx}.png")
        #val = ToPILImage()(y).convert("L")
        #val.save(f"{folder}{now}_val_i{idx}.png")

Would this code be correct?

I'm not seeing a softmax usage besides these commented-out lines:

           # preds = torch.log_softmax(model(x), 1)
           # preds_labels = torch.argmax(preds, 1)

which would not be needed before applying argmax, so you could remove it.
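A quick way to convince yourself of that with random logits:

import torch
import torch.nn.functional as F

logits = torch.randn(2, 4, 8, 8)  # [batch_size, nb_classes, height, width]

# softmax/log_softmax are monotonic per pixel, so the argmax stays the same
print(torch.equal(torch.argmax(logits, dim=1),
                  torch.argmax(F.log_softmax(logits, dim=1), dim=1)))  # True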

Besides that, the code looks good.
I would probably not recreate the criterion inside the DataLoader loop and just reuse it instead.
Also, I don't know how the dice_loss etc. is implemented, so you could check the loss and accuracy calculations with a known example, just as a smoke test.
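As a sketch of what I mean with the criterion (reusing the names from your snippet):

def check_accuracy(loader, model, num_labels, time=0, device=DEVICE):
    lossfn = nn.CrossEntropyLoss()  # create the criterion once ...
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            preds = model(x)
            celoss = lossfn(preds, y.long())  # ... and reuse it for every batch
            # rest of the loop stays as before
    model.train()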