3D CNN overfitting issue

Mukesh1729 · November 20, 2021, 1:52pm

Hey Juan,
Sorry for getting back to you late. I tried a couple of things to make it work, but they didn't go well. I rewrote the code and used a batch size of 16 and a dataset with 1700 training and 400 validation images. The loss looks like below, but I see the same pattern: it barely decreases after the second epoch (only by a very small amount).

My predictions look something like below. They are heatmaps and in many cases are quite close to the predictions of the validation set, but the loss looks quite bad:

Mukesh1729 · November 22, 2021, 9:33am

Hey Juan,
I figured that the softmax in the final layer was not allowing the loss to go down. I was able to solve it by removing the softmax.

However, I have another question: where should I apply the softmax when using MSE as the loss? The paper mentions that the final layer is followed by a softmax function.

JuanFMontesinos · November 24, 2021, 11:20am

Well, I think so.
The core idea behind it was to generate a probability distribution map for each predicted node. From that point of view, it makes sense to apply a spatial softmax to achieve that.
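
For illustration, here is a minimal sketch of what a spatial softmax could look like for a 3D network in PyTorch. It assumes the network outputs logits of shape (B, C, D, H, W) with one channel per predicted node. The function name `spatial_softmax_3d` and the tensor sizes are made up for the example and are not taken from the paper or from the code in this thread. The point is that the softmax is taken over all voxels of a channel rather than over the channel dimension, so each channel becomes a map that sums to 1.

```python
import torch
import torch.nn.functional as F


def spatial_softmax_3d(logits: torch.Tensor) -> torch.Tensor:
    """Softmax over the spatial dims (D, H, W) of a (B, C, D, H, W) tensor.

    Hypothetical helper: each (batch, channel) volume is normalized
    independently so that its voxels sum to 1.
    """
    b, c, d, h, w = logits.shape
    flat = logits.reshape(b, c, -1)        # (B, C, D*H*W)
    probs = F.softmax(flat, dim=-1)        # normalize across all voxels
    return probs.reshape(b, c, d, h, w)    # back to the volume layout


# Example with made-up sizes: 2 samples, 4 nodes, 8x16x16 volumes.
logits = torch.randn(2, 4, 8, 16, 16)
probs = spatial_softmax_3d(logits)
```

Note that after this normalization each voxel value is small (on the order of 1 / (D*H*W) for a flat map), so the output is on a different scale than a heatmap target whose peak is 1.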

Mukesh1729 · November 24, 2021, 11:37am

How can I combine the softmax with the MSE loss in my case?
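
One possible way to combine the two, as a sketch under assumptions rather than the paper's confirmed method: apply the softmax over the spatial dimensions inside the loss, and rescale each target heatmap so that it also sums to 1, so that prediction and target are on the same scale before the squared error is taken. The name `softmax_mse_loss`, the `eps` parameter, and the choice to sum the squared error over voxels (instead of averaging, which would give very small values) are all assumptions made for this example.

```python
import torch
import torch.nn.functional as F


def softmax_mse_loss(logits: torch.Tensor,
                     target_heatmaps: torch.Tensor,
                     eps: float = 1e-8) -> torch.Tensor:
    """MSE between spatially softmaxed logits and normalized targets.

    Hypothetical helper. Both inputs have shape (B, C, D, H, W);
    targets are assumed to be non-negative heatmaps.
    """
    b, c = logits.shape[:2]
    # Turn each channel of raw network output into a distribution over voxels.
    probs = F.softmax(logits.reshape(b, c, -1), dim=-1)
    # Rescale each target heatmap to sum to 1 so the scales match.
    target = target_heatmaps.reshape(b, c, -1)
    target = target / (target.sum(dim=-1, keepdim=True) + eps)
    # Sum squared error over voxels, then average over batch and channels.
    return ((probs - target) ** 2).sum(dim=-1).mean()


# Example with made-up sizes; the model would output raw logits (no softmax).
logits = torch.randn(2, 4, 8, 16, 16, requires_grad=True)
target = torch.rand(2, 4, 8, 16, 16)
loss = softmax_mse_loss(logits, target)
loss.backward()
```

With this layout the model's final layer returns raw logits and the softmax lives in the loss; at inference time the same spatial softmax would be applied to the output to get the probability maps.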