The loss function is a combination of Mean Squared Error (MSE) loss and cross-entropy loss.

When I am training my model, the loss is finite at first, but after some time it becomes NaN and stays NaN.

When I train the model on just a single batch of 10 images, the loss is finite most of the time, but occasionally it is also NaN.
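To illustrate one common way this can happen (this is a minimal NumPy sketch of the failure mode, not my actual training code; the `eps` clamp and all names are just for illustration): if the cross-entropy term takes `log()` of a predicted probability that is exactly 0, the result is non-finite and can poison the combined loss.

```python
import numpy as np

def combined_loss(pred_reg, target_reg, probs, onehot, eps=0.0):
    """MSE + cross-entropy; eps > 0 clamps probabilities away from 0."""
    mse = np.mean((pred_reg - target_reg) ** 2)
    p = np.clip(probs, eps, 1.0)
    ce = -np.mean(np.sum(onehot * np.log(p), axis=1))
    return mse + ce

# Model assigns probability exactly 0 to one class:
probs = np.array([[1.0, 0.0]])
onehot = np.array([[1.0, 0.0]])
reg_p = np.array([0.5])
reg_t = np.array([0.4])

# 0 * log(0) evaluates to nan and poisons the whole loss
bad = combined_loss(reg_p, reg_t, probs, onehot)            # -> nan
ok = combined_loss(reg_p, reg_t, probs, onehot, eps=1e-7)   # finite
```

Clamping the probabilities (or computing cross-entropy from logits with a numerically stable log-softmax) keeps the loss finite even when the model saturates.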

Please suggest a possible solution.

Thanks in advance