CTCLoss doesn't work in PyTorch

SerB · February 20, 2018, 7:53pm

Hi,

I am trying to use CTCLoss from https://github.com/SeanNaren/warp-ctc,
but it always returns loss = inf/nan for large batch sizes. Does anyone know how to fix it? Thanks in advance.

tom · February 20, 2018, 8:57pm

Can you post a minimal gist or so to reproduce?
(I.e. precompute outputs and target and just have your ctc application.)
It works for me but acts funny on invalid inputs etc.

Best regards Thomas

SerB · February 20, 2018, 9:10pm

Yes, here is a small sample which should recognize an MNIST sequence:

import numpy as np
import torch
from torch import Tensor
from torch.autograd import Variable

from warpctc_pytorch import CTCLoss

criterion = CTCLoss()

batch_size = 256
for i in range(10):
    # Two random labels in 0..9 per sample, flattened into one 1-D target tensor
    labels = Variable(torch.from_numpy(np.random.randint(0, 10, (batch_size, 2))).int()).view(-1)
    # Random activations of shape (2 time steps, batch_size, 11 output channels)
    acts = Variable(torch.randn((2, batch_size, 11)), requires_grad=True)
    # Every sample has 2 time steps and 2 target labels
    act_lens = Variable(Tensor([2] * batch_size).int())
    label_lens = Variable(Tensor([2] * batch_size).int())
    # Normalize the loss by the batch size
    loss = criterion(acts, labels, act_lens, label_lens) / batch_size
    loss.backward()
    print("loss: {}".format(loss.data[0]))

tom · February 20, 2018, 10:22pm

So can you please grab an example acts, labels, act_lens and label_lens and make a thing that only has 1 file and just the call to criterion? That would be much quicker to look at. Or you could print the types and shape of these and see if there is anything suspicious.
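A check of that kind can be sketched in plain Python, on the values converted to ordinary lists of ints. The helper name `check_ctc_inputs` and its parameters are hypothetical and not part of warp-ctc; the default blank index of 0 matches what is said about warp_ctc further down this thread.

```python
def check_ctc_inputs(num_steps, batch_size, labels, act_lens, label_lens, blank=0):
    """Return a list of problems found in CTC inputs (empty list if none).

    Hypothetical helper working on plain Python lists of ints:
    labels is the flat 1-D list of all targets, act_lens and label_lens
    hold one length per sample, num_steps is the first dimension of acts.
    """
    problems = []
    if len(act_lens) != batch_size:
        problems.append("act_lens has {} entries, expected {}".format(len(act_lens), batch_size))
    if len(label_lens) != batch_size:
        problems.append("label_lens has {} entries, expected {}".format(len(label_lens), batch_size))
    if sum(label_lens) != len(labels):
        problems.append("label_lens sum to {}, but there are {} labels".format(sum(label_lens), len(labels)))
    blanks = sum(1 for label in labels if label == blank)
    if blanks:
        problems.append("{} target(s) equal the blank index {}".format(blanks, blank))
    for i, (act_len, label_len) in enumerate(zip(act_lens, label_lens)):
        if act_len > num_steps:
            problems.append("sample {}: act_len {} exceeds {} time steps".format(i, act_len, num_steps))
        if label_len > act_len:
            problems.append("sample {}: {} labels do not fit in {} time steps".format(i, label_len, act_len))
    return problems


# Shapes as in the sample above, with a smaller batch for readability.
problems = check_ctc_inputs(
    num_steps=2, batch_size=4,
    labels=[3, 0, 5, 7, 1, 2, 9, 4],
    act_lens=[2] * 4, label_lens=[2] * 4,
)
for problem in problems:
    print(problem)
```

An empty result does not prove the inputs are valid; it only rules out the mismatches listed in the function.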

Best regards

Thomas

SerB · February 21, 2018, 5:38am

OK, I made a minimal sample which reproduces the error.
If batch_size = 1 then it works, but if batch_size = 32, for example, I get a loss value of inf.

tom · February 21, 2018, 8:05am

With the random label code you posted above, the one problem seems to be that 0 should not be in the labels. If I change the labels to

    labels = Variable(torch.from_numpy(np.random.randint(1, 10, (batch_size, 2))).int()).view(-1)

(note the lower limit of 1 in randint), I get losses of about 5±0.3.
In the warp_ctc implementation, the 0 label is reserved for the blank RNN output.
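This would also fit the batch-size dependence reported above. As a back-of-the-envelope sketch, assume the labels are drawn uniformly from 0..9 as in the `randint(0, 10, ...)` call, and assume that a single 0 among the targets is enough to break the loss for the whole batch. The function name is hypothetical.

```python
def prob_batch_contains_blank(batch_size, labels_per_sample=2, num_values=10):
    """Probability that at least one uniformly drawn label in the batch is 0.

    Assumes independent uniform draws from num_values values, one of
    which (0) collides with the blank index.
    """
    total_labels = batch_size * labels_per_sample
    prob_no_blank = (1.0 - 1.0 / num_values) ** total_labels
    return 1.0 - prob_no_blank


for batch_size in (1, 32, 256):
    print(batch_size, prob_batch_contains_blank(batch_size))
```

Under these assumptions a batch of 1 is affected in roughly one run out of five, while a batch of 32 or more is affected almost every time.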

Best regards

Thomas

SerB · February 21, 2018, 8:41am

Thanks!

It’s strange that the 0 label is reserved for the blank label instead of, for example, max_label + 1 (with max_label as an input parameter of the loss function).
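With the blank fixed at index 0, one way to keep 0-based class ids (such as the digits 0..9) is to shift them up by one when building the targets and shift them back when reading predictions. The sketch below uses plain lists; the helper names are hypothetical and not part of warp-ctc. Ten digits plus the blank then fill exactly the 11 output channels used in `acts` above.

```python
BLANK = 0  # index reserved for the blank, as described in this thread


def to_ctc_labels(class_ids):
    """Map 0-based class ids to CTC targets 1..N, leaving 0 free for the blank."""
    if any(c < 0 for c in class_ids):
        raise ValueError("class ids must be non-negative")
    return [c + 1 for c in class_ids]


def from_ctc_labels(labels):
    """Drop blanks and map the remaining CTC labels back to 0-based class ids."""
    return [label - 1 for label in labels if label != BLANK]


digits = [0, 7, 3]
targets = to_ctc_labels(digits)
print(targets)
print(from_ctc_labels([1, BLANK, 8, 4]))
```

Note that `from_ctc_labels` only removes blanks and undoes the shift; merging repeated labels during decoding is a separate step that is not shown here.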