Question about output and label channels in semantic segmentation

Vishrut10 · June 22, 2019, 9:01pm

Yes, I am trying to debug it right now. I have done this before with TF, but I want to move to PyTorch, hence the effort.

ptrblck · June 22, 2019, 9:05pm

Ok, let me know, if you get stuck somewhere.

Vishrut10 · June 22, 2019, 9:12pm

Ok, I think the batch dimension is causing the confusion: my batch size is 3, which is why I am getting [3, 512, 512] as the shape.

ptrblck · June 22, 2019, 9:24pm

In that case everything seems to work!
Sorry for missing this point, as I thought this shape refers to a single mask.

Do you get any errors? Your code looks alright then.
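
For reference, here is a minimal sketch of the shapes involved. It assumes the loss is nn.CrossEntropyLoss, which is not confirmed in this thread, and it uses random tensors and a small 8x8 resolution as stand-ins for the real model output and masks:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, nb_classes, height, width = 3, 12, 8, 8  # 8x8 instead of 512x512 to keep it fast

# Stand-in for the model output: one logit map per class -> [N, C, H, W]
output = torch.randn(batch_size, nb_classes, height, width)

# Stand-in for the batch of masks: one class index per pixel -> [N, H, W], dtype long
target = torch.randint(0, nb_classes, (batch_size, height, width))

# Assumed loss; it accepts [N, C, H, W] logits with an [N, H, W] index target
criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(output.shape, target.shape, loss.item())
```

So a target of shape [3, 512, 512] is what you would expect for a batch of 3 masks: no channel dimension, just class indices.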

Vishrut10 · June 22, 2019, 9:50pm

I do not get any errors. If you feel this is fine, then the important question is how to interpret the output, which has the shape [batch_size, nb_classes, height, width] (in my case [1, 12, 512, 512] for one image), and how to convert it back into an image with the segmentation.

predictions['out']
Out[76]:
tensor([[[[-0.0465, -0.0465, -0.0465, …, -0.0556, -0.0556, -0.0556],
          [-0.0465, -0.0465, -0.0465, …, -0.0556, -0.0556, -0.0556],
          [-0.0465, -0.0465, -0.0465, …, -0.0556, -0.0556, -0.0556],
          …,

ptrblck · June 22, 2019, 9:52pm

You see the logits in each channel corresponding to each class, i.e. channel0 gives the logits for class0, etc.
If you would like to get the predictions (as class indices), you could use:

preds = torch.argmax(predictions['out'], 1)

and could then visualize the predictions similar to your target.
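
As a rough sketch of the full round trip from logits to a color image: the random logits below stand in for predictions['out'], and the palette is a made-up assumption (one arbitrary RGB color per class), not something from your dataset.

```python
import torch

torch.manual_seed(0)
nb_classes = 12

# Stand-in for predictions['out']: logits of shape [batch_size, nb_classes, height, width]
logits = torch.randn(1, nb_classes, 8, 8)

# Take the argmax over the class dimension (dim=1): one class index per pixel
preds = torch.argmax(logits, dim=1)  # shape [1, 8, 8], dtype int64

# Hypothetical palette: one RGB color per class, shape [nb_classes, 3]
palette = torch.randint(0, 256, (nb_classes, 3), dtype=torch.uint8)

# Index the palette with the class map to get an RGB image per sample
color_mask = palette[preds]  # shape [1, 8, 8, 3], dtype uint8
print(preds.shape, color_mask.shape)
```

color_mask[0] could then be passed to something like matplotlib's imshow after calling .numpy() on it.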

Vishrut10 · June 22, 2019, 9:58pm

Brilliant! Thanks a lot, let me try this out.

geekswaroop · April 25, 2020, 8:55pm

Do the values 2-9 represent the 12 classes in the labels?

Also, why does the label have 3 channels?