I want to know whether the following two training setups (sketched in code below) are equivalent:
- 3-class binary channels for the mask (0 or 1 in each channel):
  - sigmoid output → `BCELoss` (train)
  - sigmoid output → `ge(0.5)` → dice (val)
- one-hot encoding for the mask:
  - softmax output → cross-entropy loss (train)
  - softmax output → `argmax` → dice (val)
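For concreteness, here is a minimal sketch of the two setups, assuming PyTorch and a 3-class task; the tensor shapes and the `dice_score` helper are just for illustration, not my actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

B, C, H, W = 4, 3, 64, 64
logits = torch.randn(B, C, H, W)  # raw model output (hypothetical shapes)

def dice_score(pred, target, eps=1e-6):
    # pred, target: (B, C, H, W) binary tensors; per-channel Dice, then mean
    inter = (pred * target).sum(dim=(2, 3))
    union = pred.sum(dim=(2, 3)) + target.sum(dim=(2, 3))
    return ((2 * inter + eps) / (union + eps)).mean()

# --- Setup 1: 3 binary channels (multi-label style) ---
mask_channels = torch.randint(0, 2, (B, C, H, W)).float()  # 0/1 per channel
probs = torch.sigmoid(logits)
bce = nn.BCELoss()(probs, mask_channels)                   # train loss
pred_bin = probs.ge(0.5).float()                           # val thresholding
dice_1 = dice_score(pred_bin, mask_channels)

# --- Setup 2: one-hot mask (multi-class style) ---
mask_idx = torch.randint(0, C, (B, H, W))                  # class-index map
ce = nn.CrossEntropyLoss()(logits, mask_idx)               # note: CE takes raw logits
probs_sm = F.softmax(logits, dim=1)
pred_idx = probs_sm.argmax(dim=1)                          # val argmax
pred_onehot = F.one_hot(pred_idx, C).permute(0, 3, 1, 2).float()
mask_onehot = F.one_hot(mask_idx, C).permute(0, 3, 1, 2).float()
dice_2 = dice_score(pred_onehot, mask_onehot)
```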
Does the choice between them have a large influence on model training?
Thanks in advance.