Hello,

I’m struggling to implement this paper. After some epochs the loss stops going down and my network only produces blanks. I’ve seen a lot of posts on the forum about this issue, and most of the time the problem turned out to be a misunderstanding of how CTCLoss works. So I put together a minimal example to see where my code goes wrong.

```
import torch
import torch.nn as nn

ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True, reduction='none')

# Logits with shape (batch=1, time=2, classes=5); class 0 is the blank.
predicted = torch.tensor(
    [
        [
            [0., 1., 0., 0., 0.],
            [0., 0., 0., 1., 0.],
        ]
    ]
).detach().requires_grad_()

# Target sequence: class 1 then class 3.
expected = torch.tensor([
    [1, 3]
], dtype=torch.long)

# One input length per batch element, all equal to the time dimension.
predicted_lengths = torch.tensor(predicted.shape[0] * [predicted.shape[1]])

# CTCLoss expects log-probabilities of shape (time, batch, classes).
ctc_loss(predicted.permute(1, 0, 2).log_softmax(2), expected, predicted_lengths, torch.tensor([2]))
```

and it returns

```
tensor([1.8097], grad_fn=<SWhereBackward>)
```

I was expecting it to be zero, since the prediction matches the target perfectly. If someone could explain why the loss can never be zero, or where my mistake is, I would really appreciate it.
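For reference, I also tried to reproduce the number by hand. Assuming the only valid CTC alignment for a length-2 target over 2 timesteps is (1, 3) (there is no room for blanks), the loss should just be the negative sum of the two per-step log-probabilities after `log_softmax`, and that does give the 1.8097 above:

```python
import torch

# Per-timestep log-probabilities after log_softmax on the one-hot logits.
# Both timesteps have the same shape: a 1 at the target class, 0 elsewhere.
logits = torch.tensor([0., 1., 0., 0., 0.])
log_probs = logits.log_softmax(0)
print(log_probs[1].item())  # log-prob of the target class, roughly -0.9048

# With a single valid alignment (1, 3) over two timesteps, the CTC loss is
# the negative sum of the two per-step log-probs of the target classes.
manual_loss = -2 * log_probs[1].item()
print(manual_loss)  # roughly 1.8097, matching the value returned by CTCLoss
```

So the 1.8097 seems to come straight from the softmax: a logit of 1 against four logits of 0 only gives the target class a probability of about 0.40, not 1.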

Thanks