CTCLoss error using SeanNaren/warp-ctc?

I got no errors compiling and installing the warp-ctc PyTorch binding; I followed the installation guide in warp-ctc/pytorch_binding. The only step I skipped was setting CUDA_HOME, because I don’t have a GPU. But when I run the Python test script, it says “AttributeError: module ‘warpctc_pytorch’ has no attribute ‘cpu_ctc’”. Printed traceback:

Traceback (most recent call last):
  File "test.py", line 59, in test_simple
    cpu_cost, cpu_grad = self._run_grads(label_sizes, labels, probs, sizes)
  File "test.py", line 22, in _run_grads
    cost = ctc_loss(probs, labels, sizes, label_sizes)
  File "/home/borg/anaconda2/envs/neural/lib/python3.6/site-packages/torch/nn/modules/module.py", line 363, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/borg/anaconda2/envs/neural/lib/python3.6/site-packages/warpctc_pytorch-0.1-py3.6-linux-x86_64.egg/warpctc_pytorch/__init__.py", line 76, in forward
  File "/home/borg/anaconda2/envs/neural/lib/python3.6/site-packages/warpctc_pytorch-0.1-py3.6-linux-x86_64.egg/warpctc_pytorch/__init__.py", line 17, in forward
    loss_func = warp_ctc.gpu_ctc if is_cuda else warp_ctc.cpu_ctc
AttributeError: module 'warpctc_pytorch' has no attribute 'cpu_ctc'

Any idea what’s wrong @SeanNaren ?

From the GitHub README I am guessing that this should work…

loss_func = warp_ctc.CTCLoss()

It doesn’t work. Still says “module ‘warpctc_pytorch’ has no attribute ‘cpu_ctc’”.

Pull request https://github.com/SeanNaren/warp-ctc/pull/21 seems to have broken the build. I was stuck here for the past 2 hours too.

In warp-ctc, check out a commit before that one, e.g. https://github.com/SeanNaren/warp-ctc/commit/432cf33910f1717b01f90d2662b575ab8de8fb6d.
Run git checkout 432cf33910f1717b01f90d2662b575ab8de8fb6d and build again. It should work.

Thanks, this gets the loss function running. But it returns Inf. I think this may be caused by my usage; I’m not sure I am passing the correct parameters.

loss = loss_function(tag_score, target, tag_score_sizes, target_sizes)

The shapes of the parameters are as follows:

tag_score: (4, 2, 64)    seq_len, batch_size, character_probability
target: (8,)             target labels flattened into a 1-D tensor; actual data: [5, 12, 58, 62, 10, 52, 50, 15]
tag_score_sizes: (2,)    actual data: [4, 4]
target_sizes: (2,)       actual data: [4, 4]

I have checked tag_score by summing over the last dimension: each time step sums to 1, so the entries are valid probabilities.
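As a sanity check, the consistency constraints that the loss places on these four inputs can be sketched in NumPy (shapes and values copied from above; the probability normalization is simulated with random data):

```python
import numpy as np

# Shapes from above: (seq_len=4, batch_size=2, num_classes=64)
tag_score = np.random.rand(4, 2, 64).astype(np.float32)
tag_score /= tag_score.sum(axis=2, keepdims=True)      # normalize so each time step sums to 1

target = np.array([5, 12, 58, 62, 10, 52, 50, 15], dtype=np.int32)  # flattened labels
tag_score_sizes = np.array([4, 4], dtype=np.int32)     # per-sample input lengths
target_sizes = np.array([4, 4], dtype=np.int32)        # per-sample label lengths

# Constraints CTC expects to hold:
assert target_sizes.sum() == target.shape[0]           # flattened labels cover every sample
assert (tag_score_sizes <= tag_score.shape[0]).all()   # input lengths fit within seq_len
assert (tag_score_sizes >= target_sizes).all()         # CTC needs input_len >= label_len
assert ((target >= 1) & (target < tag_score.shape[2])).all()  # label 0 is reserved for blank
print("inputs look consistent")
```

One thing worth noting: with tag_score_sizes equal to target_sizes there is no room for blanks or repeats, so any label with two identical adjacent characters has no valid alignment and the loss comes out as Inf. That could be one source of the Inf seen here.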

Two things I had to adapt to are

  • warpctc wants packed sequences
  • it is one of the few places where you need IntTensors (.int(), i.e. dtype int32) rather than LongTensors.

I packed the sequences myself; the format is not that of pack_padded_sequence for cuDNN.
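To be concrete, here is a minimal sketch of that packing (pack_labels is a hypothetical helper, not part of warp-ctc): it just concatenates the per-sample labels into one 1-D int32 array and records each sample’s length — not cuDNN’s pack_padded_sequence format.

```python
import numpy as np

def pack_labels(label_lists):
    """Flatten per-sample label lists into the 1-D int32 layout warp-ctc expects,
    together with the per-sample length vector."""
    flat = np.concatenate([np.asarray(l, dtype=np.int32) for l in label_lists])
    sizes = np.array([len(l) for l in label_lists], dtype=np.int32)
    return flat, sizes

labels, label_sizes = pack_labels([[5, 12, 58, 62], [10, 52, 50, 15]])
print(labels)        # [ 5 12 58 62 10 52 50 15]
print(labels.dtype)  # int32
print(label_sizes)   # [4 4]
```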

Best regards


Thanks, converting LongTensor to IntTensor works. The model is training now. However, the training loss fluctuates a lot and does not decrease. My input to the loss function:

tag_score’s shape is [15, 2, 64] (max_seq_len, batch_size, character_probability). It’s generated by passing the network output through sigmoid and then softmax. My network is structured as ResNet CNN layers + bidirectional LSTM + fully connected. It works well for fixed-length character recognition using NLLLoss. The training and evaluation losses:

The values seem too large for a loss. The evaluation loss curve seems to be declining, but the plotted values are all 256.5, probably because TensorBoard truncates the precision. After some time, the evaluation loss just stays still. The training loss looks bad too.

I have the same problem, but I couldn’t understand how you solved it.
Could you please explain in detail?

I get the error “AttributeError: module ‘warpctc_pytorch’ has no attribute ‘cpu_ctc’” when running the Python test script shown in the installation guide of warp-ctc/pytorch_binding.
I would like to know what you did in the last comment, where you say you converted LongTensor to IntTensor, because I couldn’t find any LongTensors.

As bornfree Harsha said above, pull request 21 broke the build. Check out an earlier version to install.

My target, tag_score_sizes, and target_sizes were Variables of LongTensor, which caused the problem when running; they need to be Variables of IntTensor. As for my last comment: I had actually forgotten to remove the log_softmax from an earlier version without CTC, and was feeding its output into sigmoid and then softmax. Now I pass the CNN + RNN + FC output directly to CTC. This gives better results than in my last comment, but it still doesn’t converge. You could refer to deepspeech for CTC usage; that repo uses CTC and actually works. I’m still trying to figure out how to make my model converge.
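For anyone hitting the same issue: warp-ctc applies softmax to its input activations internally, so applying softmax (or sigmoid + softmax) beforehand squashes the distribution twice and flattens the gradients. A small NumPy illustration (the logit values are arbitrary):

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

logits = np.array([2.0, 0.5, -1.0])       # raw FC output for one time step
once = softmax(logits)                    # what warp-ctc computes internally from raw logits
twice = softmax(softmax(logits))          # what it sees if you softmax before the loss

print(once.round(3))   # sharply peaked at the first class
print(twice.round(3))  # much flatter distribution
```

The double-softmax output is much closer to uniform, which is consistent with a loss that barely moves during training.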

I just solved the problem.
Thank you so much, BigBorg!
Thank you too, bornfree Harsha!

Is your model converging? I can’t get mine to converge with CTC. I’m using CNN + RNN + FC + CTC loss for variable-length captcha recognition. Any advice?