Semantic segmentation loss function questions

KKKcat · May 8, 2021, 2:34am

Hi there, I’m new to segmentation models.
I would like to use the deeplabv3_resnet50 model.
Each image has shape (256, 256, 3) and each label has shape (256, 256). Each pixel in a label holds a class value (0-4), and the batch size set in the DataLoader is 32.
The shape of my input batch is [32, 3, 256, 256] and the shape of the corresponding target is [32, 256, 256]. I believe this is correct.
I was trying to use nn.BCEWithLogitsLoss().

Is this the correct loss function for my case? Or should I use CrossEntropy instead?
If this is the right one, the output of my model is [32, 5, 256, 256]. Each image prediction has the shape [5, 256, 256]. Does channel 0 hold the unnormalized probabilities (logits) for class 0? To get a [32, 256, 256] tensor that matches the target and can be fed into BCEWithLogitsLoss, do I need to convert the unnormalized probabilities to class indices?
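
To make the shapes in question concrete, here is a minimal sketch of how each of the two losses would consume them. It uses random tensors in place of the real model output and labels (an assumption for illustration only), and the variable names are hypothetical. It only shows which shapes and dtypes each loss accepts, not which loss trains better.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

batch_size, num_classes, height, width = 32, 5, 256, 256

# Stand-in for the model output: raw logits, one channel per class.
# (With torchvision's deeplabv3_resnet50 this tensor would be model(x)["out"].)
logits = torch.randn(batch_size, num_classes, height, width)

# Stand-in for the label batch: one integer class index (0-4) per pixel.
target = torch.randint(0, num_classes, (batch_size, height, width))

# nn.CrossEntropyLoss accepts (N, C, H, W) logits together with an
# (N, H, W) int64 target of class indices, so no conversion is needed.
ce_loss = nn.CrossEntropyLoss()(logits, target)

# nn.BCEWithLogitsLoss expects a float target with the same shape as the
# input, so the class indices would have to be one-hot encoded first.
one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
bce_loss = nn.BCEWithLogitsLoss()(logits, one_hot)

# A per-pixel class map (e.g. for visualization or metrics) comes from
# taking the argmax over the class dimension; this is not fed to the loss.
pred = logits.argmax(dim=1)

print(ce_loss.item(), bce_loss.item(), pred.shape)
```

In this sketch the [32, 256, 256] target matches the cross-entropy call as is, while the BCE call needs the target expanded to [32, 5, 256, 256] rather than the prediction reduced to [32, 256, 256].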

Thank you everyone.