Hello,
I’m struggling to implement this paper. After a few epochs the loss stops going down, but my network only produces blanks. I’ve seen a lot of posts on the forum about this issue, and most of the time the problem turned out to be a misunderstanding of how CTCLoss works. So I tried to build a minimal example to see where my code goes wrong.
I was expecting it to be zero, since the prediction matches the target perfectly. If someone could explain why the loss can never be zero, or where my mistake is, I would really appreciate it.
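For reference, here is a small sketch of the kind of example I mean (shapes, logit values and blank index 0 are my own choices, not from my real model). Even when the logits strongly favour the "perfect" alignment, the loss stays strictly positive, because log_softmax can never output an exact log-probability of 0, and CTC marginalizes over all valid alignments:

```python
import torch
import torch.nn.functional as F

T, N, C = 4, 1, 3                 # time steps, batch size, classes (0 = blank)
target = torch.tensor([[1, 2]])   # target sequence "1 2"

# Logits that put almost all mass on the alignment 1,1,2,2,
# which collapses to the target "1 2"
logits = torch.full((T, N, C), -2.0)
logits[0, 0, 1] = 2.0
logits[1, 0, 1] = 2.0
logits[2, 0, 2] = 2.0
logits[3, 0, 2] = 2.0

log_probs = F.log_softmax(logits, dim=2)
loss = F.ctc_loss(log_probs, target,
                  input_lengths=torch.tensor([T]),
                  target_lengths=torch.tensor([2]),
                  blank=0)
print(loss.item())  # small, but strictly positive
```

The loss only approaches zero asymptotically as the logits become more peaked; it can never reach it exactly.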
I actually have a hard time understanding why `.log()` should be called on top of log_softmax, because the description of CTCLoss states:
> Log_probs: Tensor of size (T, N, C)… The logarithmized probabilities of the outputs (e.g. obtained with torch.nn.functional.log_softmax()).
So I was expecting the call to log_softmax to do the log on its own (as described in the doc).
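To check my understanding, I compared the two on a dummy tensor (values are arbitrary). `log_softmax` is indeed `softmax` followed by `log` (just computed in a more numerically stable way), so calling `.log()` a second time takes the log of negative numbers and produces NaNs:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[2.0, -1.0, 0.5],
                  [0.0,  1.0, -2.0]])  # pretend these are raw logits

a = F.log_softmax(x, dim=1)       # log is already included here
b = F.softmax(x, dim=1).log()     # equivalent, but less stable numerically

print(torch.allclose(a, b, atol=1e-6))  # True

# A second .log() on log_softmax output is a bug: the entries are
# negative, so log() of them is NaN.
print(a.log().isnan().any())  # tensor(True)
```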
Either way, I tried calling `.log()` on my actual outputs (not just this small example), and training failed in the same way (predicting only blanks), so I’m back to square one.
Thank you for your answer @SimonW.
That means my training problem does not come from the way I use CTCLoss, since according to your description I used it correctly in the first place. The inputs to log_softmax come from my last nn.Linear layer.
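Just to make sure we are talking about the same wiring, here is a sketch of the pipeline as I understand it (all sizes, the batch-first layout, and the dummy targets are placeholders, not my real model):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

N, T, H, C = 2, 10, 16, 5        # batch, time, hidden size, classes (0 = blank)
linear = nn.Linear(H, C)         # last layer of the model

features = torch.randn(N, T, H)            # batch-first features from earlier layers
logits = linear(features)                  # (N, T, C)
log_probs = F.log_softmax(logits, dim=2)   # log included here; no extra .log()

# CTCLoss expects (T, N, C), so permute before the loss
log_probs = log_probs.permute(1, 0, 2)

ctc = nn.CTCLoss(blank=0)
targets = torch.tensor([[1, 2, 3, 4, 1, 2],     # dummy targets, no blank index,
                        [2, 3, 4, 1, 2, 3]])    # no consecutive repeats
loss = ctc(log_probs, targets,
           torch.full((N,), T, dtype=torch.long),  # input lengths
           torch.tensor([6, 6]))                   # target lengths
print(loss.item())
```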
In the model I have (only kept the relevant parts):