In multi-class segmentation, the shape of the model's last layer is [batch, num_classes, H, W], e.g. [1, 6, 512, 512]; however, the ground truth shape is [1, 1, 512, 512]. How do I transform the output tensor into the ground truth shape and save it?
Thanks in advance!
If you are dealing with a multi-class classification and would like to use nn.CrossEntropyLoss (or nn.NLLLoss) as your criterion, the output of your model should contain the logits (or log probabilities, respectively) in the shape [batch_size, nb_classes, height, width], while your target should be a LongTensor containing the class indices in the shape [batch_size, height, width].
If your target already contains these indices, you can just call target = target.squeeze(1)
on it to remove the channel dimension.
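A minimal sketch of the shapes described above, using random tensors in place of a real model and dataset:

```python
import torch
import torch.nn as nn

batch_size, num_classes, H, W = 1, 6, 512, 512

# Model output: raw logits, one channel per class
output = torch.randn(batch_size, num_classes, H, W)

# Ground truth loaded as [batch, 1, H, W], storing class indices
target = torch.randint(0, num_classes, (batch_size, 1, H, W))

# Remove the channel dimension and make sure the dtype is long
target = target.squeeze(1).long()   # -> [1, 512, 512]

criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(output.shape, target.shape)
```

nn.CrossEntropyLoss applies log_softmax internally, so the model output is passed in as raw logits without a softmax layer.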
I already know how to do that; maybe my statement wasn't clear and you misunderstood my question. We should use torch.argmax() on the predicted multi-class segmentation outputs.
Hi, I am in the same situation. In the end, did you use softmax/argmax along the num_classes dimension? [batch, num_classes, H, W] -> [batch, H, W]
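Taking the argmax along dim=1 is enough here; a softmax is not needed first, since softmax is monotonic and does not change which class has the largest value. A short sketch with a random logits tensor standing in for real model output:

```python
import torch

# Predicted logits: [batch, num_classes, H, W]
output = torch.randn(1, 6, 512, 512)

# Class index per pixel: [batch, H, W]
pred = torch.argmax(output, dim=1)

# Add the channel dimension back if you need [batch, 1, H, W]
# to match the ground truth layout before saving
pred = pred.unsqueeze(1)
print(pred.shape)
```

The result can then be saved with torch.save, or converted pixel-wise to an image format of your choice.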