This question is about semantic segmentation.

I made a network for semantic segmentation. The loss function is as follows:
import torch.nn as nn
import torch.nn.functional as F

class CrossEntropyLoss2d(nn.Module):

    def __init__(self, weight=None):
        super().__init__()
        # nn.NLLLoss2d is deprecated in recent PyTorch; plain nn.NLLLoss also accepts 4-D input
        self.loss = nn.NLLLoss2d(weight)

    def forward(self, outputs, targets):
        # take the log-softmax over the class dimension (dim=1 for NCHW tensors)
        return self.loss(F.log_softmax(outputs, dim=1), targets)
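For reference, a minimal sketch of how this loss would be called; the variable names and the batch size of 4 are my own illustration, matching the shapes described below:

import torch

criterion = CrossEntropyLoss2d()
outputs = torch.randn(4, 22, 256, 256)         # (batch_size, num_classes, H, W) raw logits
targets = torch.randint(0, 22, (4, 256, 256))  # one class index per pixel, dtype int64
loss = criterion(outputs, targets)             # scalar loss tensor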

The loss value descends to about 0.08; at the beginning of training it is about 3.0.

The dataset is PASCAL VOC.

My net's output size is (batch_size, 22, 256, 256), where 22 is the number of classes.

Why is it that when I select one channel of the output and convert it with transforms.ToPILImage(), image.show() displays noise?

I think the net is close to convergence, but the decoded image looks like noise.
Please help me.

I think that in a single output channel, the value at each pixel is the probability that that pixel belongs to one particular class. So even if a block of pixels all belong to the same class, their values within that channel will still differ slightly, which is why it can look like noise.

I think you need to save the index of the maximum probability for every pixel, and then assign a color to each pixel based on that index. For example, we could color red all pixels whose max-probability index is 12 (see the sketch below).
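Something like this minimal sketch; the palette is made up for illustration, and outputs is assumed to be your (batch_size, 22, 256, 256) network output:

import torch
import numpy as np
from PIL import Image

# outputs: (batch_size, 22, 256, 256) network output, as in the question
pred = outputs[0].argmax(dim=0)                # (256, 256) map of class indices in [0, 21]

# hypothetical palette: one RGB color per class
palette = np.random.randint(0, 256, size=(22, 3), dtype=np.uint8)
palette[12] = [255, 0, 0]                      # e.g. red for class 12

color = palette[pred.cpu().numpy()]            # (256, 256, 3) uint8 color image
Image.fromarray(color).show()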

🙂

Oh, thank you for your help!