Cross Entropy loss is not decreasing

You shouldn’t pass the softmax output into the CrossEntropy loss. It computes log_softmax(y2) internally, so you end up with log_softmax(softmax(z)), which would make for a pretty awkward gradient. That was actually a frequent issue among my students, so I made a kind of cheatsheet for them: Why are there so many ways to compute the Cross Entropy Loss in PyTorch and how do they differ?
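For illustration, here is a minimal sketch of the difference (the shapes and tensors are placeholders, not the code from this thread):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

logits = torch.randn(4, 10)           # raw model outputs: (batch, num_classes)
targets = torch.randint(0, 10, (4,))  # class indices

# Correct: pass raw logits; CrossEntropyLoss applies log_softmax internally
loss = criterion(logits, targets)

# Wrong: applying softmax first means the loss sees log_softmax(softmax(logits)),
# which flattens the outputs and weakens the gradients
probs = torch.softmax(logits, dim=1)
wrong_loss = criterion(probs, targets)
```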

@ptrblck, I did try removing the softmax as well. In this case the loss decreases, but the accuracy does not improve.

That’s good. Now that the loss decreases, the next step is to find out why the accuracy doesn’t increase. Can you show the function that computes the accuracy?
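For reference, a typical accuracy function over raw logits looks roughly like this (a sketch, not the poster’s actual code; since softmax is monotonic, argmax on logits and on softmax outputs give the same predictions):

```python
import torch

def accuracy(logits: torch.Tensor, targets: torch.Tensor) -> float:
    # argmax over the class dimension gives the predicted class index
    preds = logits.argmax(dim=1)
    # fraction of predictions that match the target labels
    return (preds == targets).float().mean().item()
```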
