I have read several blog articles to get an idea of how CTCLoss works algorithmically, and the PyTorch documentation seems straightforward. All the examples I have seen online conform to my understanding, but I am having trouble getting it to work in practice. Here’s a minimal working example where the losses should be close to zero, because the inputs match the targets.
The unreduced loss when I run this is

```
tensor([5.9605e-07, inf])
```

instead of both terms being close to zero. I would appreciate an explanation of why the second loss is infinite, and how to use CTCLoss correctly so that both terms are close to zero.
```python
import torch

e = 1e-7

targets = torch.LongTensor([
    [1, 2, 1],
    [1, 1, 1]
])

# easier to enter in shape (N=2, T=4, C=2+1)
logprobs = torch.Tensor([
    [[e, 1-2*e, e], [e, e, 1-2*e], [e, 1-2*e, e], [1-2*e, e, e]],
    [[e, 1-2*e, e], [e, 1-2*e, e], [e, 1-2*e, e], [1-2*e, e, e]]
])
logprobs = torch.log(logprobs)
logprobs = torch.transpose(logprobs, 0, 1)  # get to correct shape (T, N, C)

input_lengths = torch.LongTensor([4, 4])
target_lengths = torch.LongTensor([3, 3])

loss = torch.nn.CTCLoss(blank=0, reduction='none')
loss(logprobs, targets, input_lengths, target_lengths)
```
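To make my expectation concrete, here is a brute-force sketch of what I understand CTC to compute: sum the probability of every frame-wise label path that collapses (merge repeats, then drop blanks) to the target, and take the negative log of that sum. The helper names `collapse` and `brute_force_ctc` are my own, not part of PyTorch:

```python
import itertools
import torch

def collapse(path, blank=0):
    # CTC collapse rule: merge runs of repeated labels, then drop blanks
    merged = [k for k, _ in itertools.groupby(path)]
    return tuple(label for label in merged if label != blank)

def brute_force_ctc(probs, target, blank=0):
    # probs: (T, C) per-frame probabilities (not log-probabilities).
    # Sum P(path) over every frame-wise label path of length T that
    # collapses to the target; CTC loss would be -log of this sum.
    T, C = probs.shape
    total = 0.0
    for path in itertools.product(range(C), repeat=T):
        if collapse(path, blank) == tuple(target):
            p = 1.0
            for t, label in enumerate(path):
                p *= probs[t, label].item()
            total += p
    return total

e = 1e-7
probs_seq1 = torch.tensor([
    [e, 1-2*e, e],
    [e, e, 1-2*e],
    [e, 1-2*e, e],
    [1-2*e, e, e],
])
print(brute_force_ctc(probs_seq1, [1, 2, 1]))  # very close to 1.0, so -log is near zero
```

This is exhaustive (C**T paths) so it only makes sense for tiny T and C like this example, but it matches the near-zero loss I expect for the first sequence.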