3D Unet Implementation doesn't overfit

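I assume you are using nn.BCELoss as your criterion.
If so, could you remove the last sigmoid from your model and pass the raw logits to nn.BCEWithLogitsLoss instead? It applies the sigmoid internally and is numerically more stable than a separate sigmoid followed by nn.BCELoss.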
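
Here is a rough sketch of the swap. I don't know your actual model, so the 1x1x1 `head` conv, the tensor shapes, and the random data are placeholders I made up for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the last layer of a 3D U-Net.
# It returns raw logits: no sigmoid is applied inside the model.
head = nn.Conv3d(in_channels=4, out_channels=1, kernel_size=1)

# Made-up data with shape (N, C, D, H, W); targets must be float for both losses.
features = torch.randn(2, 4, 8, 8, 8)
target = torch.randint(0, 2, (2, 1, 8, 8, 8)).float()

logits = head(features)

# Before: sigmoid on the output, then nn.BCELoss.
loss_bce = nn.BCELoss()(torch.sigmoid(logits), target)

# After: raw logits go straight into nn.BCEWithLogitsLoss.
loss_logits = nn.BCEWithLogitsLoss()(logits, target)

# For moderate logits both losses should agree closely; they can
# diverge once the sigmoid saturates.
print(loss_bce.item(), loss_logits.item())

# At inference time, apply the sigmoid yourself to get probabilities.
probs = torch.sigmoid(logits)
```

Note that after the change the model output is no longer a probability, so any thresholding or metric code needs the explicit `torch.sigmoid` as in the last line.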

Let me know if this helps or if you are still seeing this behavior.