# How do I get the final output tensor in a semantic segmentation task?

I’m practicing a semantic segmentation task.
Since there are two classes for each pixel, the output of the model has shape (1, 2, 512, 512). It has two channels, but I need to convert it into shape (1, 1, 512, 512) to match the shape of my label. What method should I use to perform this transformation?

Hello,

I think the shape of the network output depends on which loss function you employ.
For binary classification, you can use either `CrossEntropyLoss` or `BCELoss`.
For `CrossEntropyLoss`, the shape of the output should be `[batch_size, nb_classes, H, W]`; a probability map can then be produced by `softmax`.
For `BCELoss`, the shape should be `[batch_size, H, W]`, the same as the label, and it should be used together with `sigmoid`.

Note: `CrossEntropyLoss` already contains `log_softmax`, so you can pass the raw model output and the label directly.
`BCEWithLogitsLoss` = `BCELoss` + `sigmoid`.
See the PyTorch docs for more details.
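To make the two shape conventions above concrete, here is a minimal sketch with dummy tensors (the sizes `1, 2, 512, 512` are taken from the question; the random tensors are placeholders for real model output and labels):

```python
import torch
import torch.nn as nn

batch_size, n_classes, H, W = 1, 2, 512, 512

# Case 1: CrossEntropyLoss expects raw logits of shape
# [batch_size, n_classes, H, W] and a LongTensor label of
# shape [batch_size, H, W] (class indices, no channel dim).
logits = torch.randn(batch_size, n_classes, H, W)
target = torch.randint(0, n_classes, (batch_size, H, W))
ce = nn.CrossEntropyLoss()(logits, target)

# Case 2: BCEWithLogitsLoss (sigmoid + BCE fused) expects a
# single-channel output and a float label of the same shape.
logits_1ch = torch.randn(batch_size, 1, H, W)
target_f = torch.randint(0, 2, (batch_size, 1, H, W)).float()
bce = nn.BCEWithLogitsLoss()(logits_1ch, target_f)
```

Note that in case 1 the label carries class indices, not one-hot vectors, which is why it has no channel dimension.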

How do I transform the final output tensor, i.e. (1, 2, 512, 512), into the ground-truth shape (1, 1, 512, 512)? I already know how to use `CrossEntropyLoss` and `BCELoss`.

Have you solved the problem?

Hi,
@Mellow

This line worked for me.

```
_, pred = torch.max(scores, dim=1)
```

https://pytorch.org/docs/stable/torch.html#torch.max

Here `scores` is your final tensor containing the per-class scores.

For instance, if `scores` has size `[10, 150, 256, 256]`, meaning there are 150 classes to segment, the code above gives a `[10, 256, 256]` tensor. You can then call `.unsqueeze(1)` to get your desired dimension.

PS: `torch.max` returns a tuple whose second element is the tensor of indices, i.e. the predicted classes.
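Applied to the shapes from the original question, the whole transformation looks like this (the random tensor stands in for the real model output):

```python
import torch

# stand-in for the model output of shape [N, C, H, W]
scores = torch.randn(1, 2, 512, 512)

# max over the class dimension returns (values, indices);
# the indices are the per-pixel predicted class labels
_, pred = torch.max(scores, dim=1)   # shape: [1, 512, 512]

# restore the channel dimension to match the label shape
pred = pred.unsqueeze(1)             # shape: [1, 1, 512, 512]
```

`torch.argmax(scores, dim=1)` would give the same indices directly, without the unused values tensor.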


thank you very much!