3D CNN overfitting issue

Hey Juan,
Sorry for the late reply. I tried a couple of things to make it work, but they didn't go well. I rewrote the code and used a batch size of 16 with a dataset of 1700 training and 400 validation images. The loss is shown below, but I see the same pattern: it barely decreases after the second epoch (only by a very small amount).
image

My predictions look something like the ones below. They are heatmaps and in many cases are quite close to the predictions of the validation set, but the loss looks quite bad:

Hey Juan,
I figured out that the softmax in the final layer was preventing the loss from going down. I was able to solve it by removing the softmax.

However, I have another question: where should I apply the softmax when using MSE as the loss? The paper mentions that the final layer is followed by a softmax function.

Well, I think so.
The core idea behind it was to generate a probability distribution map for each predicted node. From that point of view, it makes sense to apply a spatial softmax to achieve that.
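
To make "spatial softmax" concrete, here is a minimal NumPy sketch. It assumes the network output has shape (N, K, D, H, W), with one channel per predicted node; that layout and the name `spatial_softmax` are assumptions for illustration, not something taken from the paper. The softmax runs over all voxels of each channel, so every channel becomes a probability map that sums to 1.

```python
import numpy as np

def spatial_softmax(logits):
    """Softmax over the spatial axes of a (N, K, D, H, W) array.

    Each (sample, node) channel is normalized independently, so it
    becomes a probability distribution over its D*H*W voxels.
    """
    n, k = logits.shape[:2]
    flat = logits.reshape(n, k, -1)                  # (N, K, D*H*W)
    flat = flat - flat.max(axis=-1, keepdims=True)   # numerical stability
    exp = np.exp(flat)
    probs = exp / exp.sum(axis=-1, keepdims=True)
    return probs.reshape(logits.shape)

# Hypothetical example: 2 samples, 3 nodes, a 4x8x8 volume.
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 3, 4, 8, 8))
probs = spatial_softmax(logits)
channel_sums = probs.sum(axis=(2, 3, 4))  # one value per (sample, node)
```

Note that this is different from a softmax over the channel axis, which would make the nodes compete with each other at every voxel instead of producing one distribution per node.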

How can I integrate the softmax and the MSE loss in my case?
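
One possible way to combine the two, sketched in NumPy under stated assumptions: apply the spatial softmax to the raw network output, normalize each target heatmap so it also sums to 1, and take the MSE between the two. The (N, K, D, H, W) layout, the target normalization, and the name `softmax_mse_loss` are assumptions for illustration; the paper may do this differently.

```python
import numpy as np

def softmax_mse_loss(logits, target):
    """MSE between spatially softmaxed logits and normalized target heatmaps.

    logits, target: arrays of shape (N, K, D, H, W).
    target must be non-negative with a positive sum per channel.
    """
    n, k = logits.shape[:2]
    flat = logits.reshape(n, k, -1)
    flat = flat - flat.max(axis=-1, keepdims=True)   # numerical stability
    probs = np.exp(flat)
    probs /= probs.sum(axis=-1, keepdims=True)       # each channel sums to 1

    tgt = target.reshape(n, k, -1)
    tgt = tgt / tgt.sum(axis=-1, keepdims=True)      # same scale as probs
    return float(np.mean((probs - tgt) ** 2))

# Hypothetical data: random positive "heatmaps" as targets.
rng = np.random.default_rng(0)
target = rng.random(size=(2, 3, 4, 8, 8)) + 1e-3
random_logits = rng.normal(size=target.shape)

loss_random = softmax_mse_loss(random_logits, target)
# Logits equal to log(target) reproduce the normalized target exactly.
loss_perfect = softmax_mse_loss(np.log(target), target)
```

One thing to watch: after the softmax, each voxel value is on the order of 1 / (D*H*W), so the squared errors (and their gradients) are very small. If the targets are left unnormalized, for example with a peak of 1, the softmaxed output can never match them, which may be one reason a loss stops decreasing. Rescaling the loss by a constant is a common workaround, though whether it helps here is untested.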