I am trying to build a U-Net multi-class segmentation model for a brain tumor dataset. I implemented the Dice loss as an `nn.Module`, following some implementations I found online. During training the loss fluctuates and does not converge, but if I train the same model with `CrossEntropyLoss` it converges well. While debugging I noticed that `requires_grad` is `False` for the output of the loss. I am unable to find the issue here.
torch.argmax is not differentiable and would thus detach the output from the computation graph.
This should also yield an error such as:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
so I’m not sure if you are rewrapping the loss in a new tensor or why the error isn’t raised in your case.
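A minimal sketch of the issue (toy shapes, not your actual model): the output of `torch.argmax` is an integer tensor with no `grad_fn`, so anything computed from it cannot backpropagate, and calling `backward()` on it raises the error above.

```python
import torch

# Toy "logits" that require gradients, mimicking a model output
logits = torch.randn(2, 4, 8, 8, requires_grad=True)  # [batch, classes, H, W]

# argmax picks the hard class index; the result is detached
# from the computation graph (no grad_fn, requires_grad=False)
preds = logits.argmax(dim=1)
print(preds.requires_grad)  # False
print(preds.grad_fn)        # None

# Any scalar derived from preds cannot backpropagate
try:
    preds.float().sum().backward()
except RuntimeError as e:
    print(e)  # element 0 of tensors does not require grad and does not have a grad_fn
```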
I’m not sure where this implementation comes from, but note that other implementations are using softmax to calculate the probabilities for each class.
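For reference, here is one hypothetical way such a soft Dice loss is often written (my own sketch, not your implementation): the probabilities come from `softmax`, so the loss stays differentiable, and the hard `argmax` is only used later for evaluation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftDiceLoss(nn.Module):
    """Multi-class Dice loss on softmax probabilities (illustrative sketch)."""
    def __init__(self, smooth=1.0):
        super().__init__()
        self.smooth = smooth

    def forward(self, logits, targets):
        # logits: [N, C, H, W]; targets: [N, H, W] with class indices
        num_classes = logits.shape[1]
        probs = F.softmax(logits, dim=1)  # differentiable, unlike argmax
        # One-hot encode targets to [N, C, H, W]
        targets_onehot = F.one_hot(targets, num_classes).permute(0, 3, 1, 2).float()
        dims = (0, 2, 3)  # reduce over batch and spatial dims, keep per-class scores
        intersection = (probs * targets_onehot).sum(dims)
        cardinality = probs.sum(dims) + targets_onehot.sum(dims)
        dice = (2.0 * intersection + self.smooth) / (cardinality + self.smooth)
        return 1.0 - dice.mean()

logits = torch.randn(2, 4, 8, 8, requires_grad=True)
targets = torch.randint(0, 4, (2, 8, 8))
loss = SoftDiceLoss()(logits, targets)
print(loss.requires_grad)  # True: gradients can flow back through softmax
loss.backward()
```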