I’ve trained an image segmentation model using BCE loss and the Adam optimizer with no weight decay. The network uses no dropout. The training loss curve shows a spike after some steps.
What are the possible reasons for this?
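
To make "spike" concrete, here is a minimal sketch of how the affected step could be located from logged per-step losses. The helper `find_spikes`, its `window` and `factor` defaults, and the synthetic curve are hypothetical illustrations, not taken from the actual training run.

```python
from statistics import median


def find_spikes(losses, window=20, factor=3.0):
    """Return step indices where the loss exceeds `factor` times the
    median of the preceding `window` steps (hypothetical helper)."""
    spikes = []
    for i in range(window, len(losses)):
        # The median is robust to an earlier spike sitting inside the window.
        baseline = median(losses[i - window:i])
        if baseline > 0 and losses[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Synthetic curve for illustration only: a slow decay with one injected spike.
losses = [1.0 / (1 + 0.05 * step) for step in range(100)]
losses[60] = 2.0

print(find_spikes(losses))
```

Knowing the exact step makes it possible to check which batch, learning rate, and gradient norm were in effect when the loss jumped.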