I would like my model to predict, say, K planes from an RGB image. The final layer is a softmax, so the model outputs a tensor of size [B, K+1, H, W], where the extra channel is the non-planar mask. I then sum all K planar masks along the channel dimension, which leaves a tensor of size [B, 2, H, W] (one planar channel and one non-planar channel). My ground-truth labels use 1 for planar regions and 0 for non-planar regions.
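To make the setup concrete, here is a minimal sketch of the channel-summing step. It assumes PyTorch and assumes the non-planar mask is the last channel; the post specifies neither. The function name `collapse_to_binary` and the example shapes are hypothetical.

```python
import torch


def collapse_to_binary(probs: torch.Tensor) -> torch.Tensor:
    """Collapse [B, K+1, H, W] softmax probabilities to [B, 2, H, W].

    Assumption: the non-planar mask is the LAST channel of `probs`.
    Output channel 0 = non-planar, channel 1 = planar, so the channel
    index matches the 0/1 ground-truth labels.
    """
    planar = probs[:, :-1].sum(dim=1, keepdim=True)  # sum of the K planar masks
    non_planar = probs[:, -1:]                       # keep the channel dimension
    return torch.cat([non_planar, planar], dim=1)


# Illustrative shapes only (not taken from the post).
B, K, H, W = 2, 5, 4, 4
logits = torch.randn(B, K + 1, H, W)
probs = torch.softmax(logits, dim=1)   # stands in for the model's softmax output
binary = collapse_to_binary(probs)     # [B, 2, H, W]
```

Because the K+1 channels sum to 1 at every pixel after the softmax, the two collapsed channels also sum to 1, so the result can be read as a per-pixel binary probability map.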
Which loss function would be best for this application?
I’ve also tried cross entropy, but the loss becomes NaN after 2 iterations.
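One possible source of the NaN (a guess, since the post does not show the loss code) is taking the log of a summed probability that has underflowed to exactly 0, which gives `-inf` and can turn gradients into NaN. The sketch below is one way to compute the same two-class cross entropy in log space. It assumes PyTorch, assumes the model can expose raw logits instead of softmax outputs, and assumes the non-planar channel is last; `binary_planar_nll` is a hypothetical name.

```python
import torch
import torch.nn.functional as F


def binary_planar_nll(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Two-class NLL computed in log space.

    logits: [B, K+1, H, W] raw scores (assumption: no softmax applied yet,
            non-planar channel last).
    target: [B, H, W] integer labels, 1 = planar, 0 = non-planar.
    """
    log_probs = F.log_softmax(logits, dim=1)                       # [B, K+1, H, W]
    # log(sum_k p_k) over the K planar channels, without leaving log space
    log_planar = torch.logsumexp(log_probs[:, :-1], dim=1, keepdim=True)
    log_non_planar = log_probs[:, -1:]
    log_binary = torch.cat([log_non_planar, log_planar], dim=1)    # [B, 2, H, W]
    return F.nll_loss(log_binary, target)


# Illustrative usage with made-up shapes.
example_logits = torch.randn(2, 6, 4, 4, requires_grad=True)
example_target = torch.randint(0, 2, (2, 4, 4))   # dtype int64, values in {0, 1}
example_loss = binary_planar_nll(example_logits, example_target)
example_loss.backward()
```

If the model has to keep its final softmax, a rougher alternative is to clamp the summed probabilities to a small positive value before taking the log. Note also that `torch.nn.CrossEntropyLoss` expects raw logits and applies log-softmax internally, so passing already-normalized probabilities to it would not compute the intended loss.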