Dear @ptrblck, sorry for asking so many questions.
I still have trouble understanding this. For the binary case, the input to the network is a [1, w, h] image. The prediction is [1, 2, w, h] (assuming the batch size is one), and the mask (ground truth) is [1, w, h]. I have two main questions:
- How is the loss computed? I would expect the loss to take each cell's predicted value (which is 0 or 1) and compare it to the corresponding mask value (again 0 or 1), but the prediction has two channels per cell.
- How do I get the final predicted mask as an image? The output is a per-class probability for each cell, so how am I supposed to get 0 or 1?
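
For reference, here is a minimal sketch of the setup as I understand it. It assumes `nn.CrossEntropyLoss` is the criterion and that the final mask comes from an `argmax` over the class dimension; both are my assumptions. The sizes `w` and `h` and the random tensors are placeholders standing in for the real network output and mask.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder sizes, chosen only for illustration.
w, h = 4, 5

# Stand-in for the raw network output (logits): [batch, n_classes, w, h].
logits = torch.randn(1, 2, w, h)

# Stand-in for the ground-truth mask: [batch, w, h], class indices 0 or 1 (dtype long).
mask = torch.randint(0, 2, (1, w, h))

# CrossEntropyLoss applies log_softmax over the class dimension (dim=1), then for
# each cell takes the negative log-probability of the class given by the mask,
# and averages over all cells.
criterion = nn.CrossEntropyLoss()
loss = criterion(logits, mask)

# Predicted mask: for each cell, the index of the channel with the larger score.
pred_mask = logits.argmax(dim=1)  # shape [1, w, h], values 0 or 1

print(loss.item())
print(pred_mask.shape)
```

If this is right, the loss never compares 0/1 predictions directly; it compares the two per-cell scores against the class index in the mask, and the 0/1 image only appears after the `argmax`.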