How do I get the final output tensor in a semantic segmentation task?

Mellow · April 4, 2019, 12:16pm

I’m practicing a semantic segmentation task.
Since there are two classes for each pixel, the output of the model has shape (1, 2, 512, 512). It has two channels, but I need to convert it to shape (1, 1, 512, 512) to match the shape of my label. Which method should I use to do this transformation?

MariosOreo · April 4, 2019, 1:46pm

Hello,

I think the shape of the network output depends on which loss function you use.
For binary segmentation, you can use either CELoss (nn.CrossEntropyLoss) or BCELoss (nn.BCELoss).
For CELoss, the output should have shape [batch_size, nb_classes, H, W]; applying softmax over the class dimension then gives a probability map.
For BCELoss, the output should have the same shape as the label, e.g. [batch_size, H, W], and it should be combined with a sigmoid.

Note: CELoss already includes log-softmax, so you only need to pass the raw model output and the label.
BCEWithLogitsLoss = sigmoid + BCELoss
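
To make the two options above concrete, here is a minimal sketch. The tensor sizes and variable names are made up for illustration, and the data is random; it only shows which shapes and dtypes each loss expects, not a training setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, H, W = 1, 8, 8  # small hypothetical sizes, just for the demo

# Option 1: two output channels + CrossEntropyLoss.
# Output: raw logits of shape [N, 2, H, W].
# Target: class indices (0 or 1) of shape [N, H, W], dtype long.
logits_2ch = torch.randn(batch_size, 2, H, W)
target_idx = torch.randint(0, 2, (batch_size, H, W))
ce_loss = nn.CrossEntropyLoss()(logits_2ch, target_idx)

# Option 2: one output channel + BCEWithLogitsLoss.
# Output: raw logits of shape [N, 1, H, W].
# Target: float tensor with the same shape as the output.
logits_1ch = torch.randn(batch_size, 1, H, W)
target_float = target_idx.unsqueeze(1).float()
bce_loss = nn.BCEWithLogitsLoss()(logits_1ch, target_float)

# BCEWithLogitsLoss matches sigmoid followed by BCELoss.
bce_manual = nn.BCELoss()(torch.sigmoid(logits_1ch), target_float)
```

Note that with CrossEntropyLoss the label does not need the extra channel dimension, so a (1, 1, 512, 512) label would be squeezed to (1, 512, 512) before computing the loss.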

XiaoAHeng · July 24, 2019, 2:50am

How do I transform the final output tensor, i.e. (1, 2, 512, 512), into the ground-truth shape (1, 1, 512, 512)? I already know how to use CELoss and BCELoss.

XiaoAHeng · July 24, 2019, 2:51am

Have you solved the problem?

Nikronic · July 24, 2019, 8:01pm

Hi,
@Mellow

This line worked for me.


_, pred = torch.max(scores, dim=1)

https://pytorch.org/docs/stable/torch.html#torch.max

Here scores is your final output tensor containing the per-class scores (probabilities or raw logits; the result is the same either way).

For instance, if the scores tensor has size [10, 150, 256, 256], meaning there are 150 classes to segment, the code above gives a [10, 256, 256] tensor of class indices. You can then call .unsqueeze(1) to get the desired [10, 1, 256, 256] shape.

PS: torch.max with a dim argument returns a tuple (values, indices); the second element is the tensor of predicted class indices.
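
Applied to the shapes from the original question, the conversion might look like the sketch below. The scores tensor is random stand-in data, and pred_alt is only an illustrative name for an equivalent one-step variant using argmax with keepdim=True.

```python
import torch

# Stand-in for the model output: [batch, classes, H, W]
scores = torch.randn(1, 2, 512, 512)

# torch.max over the class dimension returns (values, indices);
# the indices are the predicted class per pixel.
_, pred = torch.max(scores, dim=1)   # shape [1, 512, 512]
pred = pred.unsqueeze(1)             # shape [1, 1, 512, 512]

# Equivalent one-step alternative that keeps the class dimension.
pred_alt = scores.argmax(dim=1, keepdim=True)  # shape [1, 1, 512, 512]
```

The result is an integer (long) tensor of class indices. Since argmax is not differentiable, this conversion is suited to evaluation or visualization rather than to computing the training loss.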

XiaoAHeng · July 25, 2019, 2:29am

Thank you very much!