CTC Loss function not working with CUDA when using torch.int32

amankhandelia · November 21, 2018, 6:18am

As per the docs

In order to use CuDNN, the following must be satisfied: targets must be in concatenated format

and

the integer arguments must be of dtype torch.int32

So I modified the sample code on the docs page (referenced above) to satisfy these conditions:

ctc_loss = torch.nn.CTCLoss()
log_probs = torch.randn(50, 16, 20).log_softmax(2).detach().requires_grad_().cuda()
target_lengths = torch.randint(10,30,(16,), dtype=torch.int32).cuda()
targets = torch.randint(1, 20, (1, target_lengths.sum()), dtype=torch.int32).view(-1).cuda()
input_lengths = torch.full((16,), 50, dtype=torch.int32).cuda()
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()

But this did not go well, and the code threw this error:

RuntimeError Traceback (most recent call last)
in
5 input_lengths = torch.full((16,), 50, dtype=torch.int32).cuda()
6
----> 7 loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
8 loss.backward()

~/Documents/projects/rosetta/venv/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
475 result = self._slow_forward(*input, **kwargs)
476 else:
→ 477 result = self.forward(*input, **kwargs)
478 for hook in self._forward_hooks.values():
479 hook_result = hook(self, input, result)

~/Documents/projects/rosetta/venv/lib/python3.6/site-packages/torch/nn/modules/loss.py in forward(self, log_probs, targets, input_lengths, target_lengths)
1191
1192 def forward(self, log_probs, targets, input_lengths, target_lengths):
→ 1193 return F.ctc_loss(log_probs, targets, input_lengths, target_lengths, self.blank, self.reduction)
1194
1195 # TODO: L1HingeEmbeddingCriterion

~/Documents/projects/rosetta/venv/lib/python3.6/site-packages/torch/nn/functional.py in ctc_loss(log_probs, targets, input_lengths, target_lengths, blank, reduction)
1470 >>> loss.backward()
1471 """
→ 1472 return torch.ctc_loss(log_probs, targets, input_lengths, target_lengths, blank, _Reduction.get_enum(reduction))
1473
1474

RuntimeError: Expected tensor to have CPU Backend, but got tensor with CUDA Backend (while checking arguments for cudnn_ctc_loss)

So I modified the datatype to torch.long (which was more of a hunch than an informed decision).

ctc_loss = torch.nn.CTCLoss()
log_probs = torch.randn(50, 16, 20).log_softmax(2).detach().requires_grad_().cuda()
target_lengths = torch.randint(10,30,(16,), dtype=torch.long).cuda()
targets = torch.randint(1, 20, (1, target_lengths.sum()), dtype=torch.long).view(-1).cuda()
input_lengths = torch.full((16,), 50, dtype=torch.long).cuda()
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()

And this worked effortlessly on the GPU. So have I misunderstood something in the docs, or is something missing from the docs?

@tom

tom · November 21, 2018, 10:53am

Unfortunately, the error doesn’t tell you which tensor should have a CPU backend.
cudnn wants the targets on the CPU (and I’d think the lengths, too).
Your second example probably uses the native implementation, but there, too, it is preferable to keep the lengths on the CPU (or else they’ll be moved to the CPU and you have a sync point). You can tell which backend is used by printing loss.grad_fn and checking whether the grad_fn’s name includes “cudnn”.
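A minimal sketch of that setup, assuming the shapes from the first post: only log_probs goes to the GPU, while targets and both length tensors stay on the CPU as torch.int32. With the default 'mean' reduction the outermost grad_fn is the reduction node, so the hypothetical helper grad_fn_names below (not part of PyTorch) walks the autograd graph and collects every node name. Whether a “cudnn” node actually shows up depends on your PyTorch build, your cuDNN version, and the remaining conditions in the docs, so treat the printed result as a diagnostic rather than a guarantee.

```python
import torch


def grad_fn_names(tensor):
    """Collect the names of all autograd nodes reachable from tensor.grad_fn.

    Hypothetical helper for illustration; it only uses the public
    grad_fn.name() and grad_fn.next_functions attributes.
    """
    names, stack, seen = [], [tensor.grad_fn], set()
    while stack:
        node = stack.pop()
        if node is None or node in seen:
            continue
        seen.add(node)
        names.append(node.name())
        stack.extend(fn for fn, _ in node.next_functions)
    return names


# Fall back to the CPU so the sketch also runs on machines without CUDA.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Only the log-probabilities live on the GPU.
log_probs = torch.randn(50, 16, 20, device=device).log_softmax(2).requires_grad_()

# Integer arguments stay on the CPU as int32; targets are concatenated (1-D).
target_lengths = torch.randint(10, 30, (16,), dtype=torch.int32)
targets = torch.randint(1, 20, (int(target_lengths.sum()),), dtype=torch.int32)
input_lengths = torch.full((16,), 50, dtype=torch.int32)

ctc_loss = torch.nn.CTCLoss()
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()

names = grad_fn_names(loss)
uses_cudnn = any("cudnn" in name.lower() for name in names)
print("autograd nodes:", names)
print("cuDNN CTC kernel in graph:", uses_cudnn)
```

If uses_cudnn comes out False on a CUDA machine, the native implementation was chosen, which still computes the same loss.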

Best regards

Thomas

jun_zhou · December 18, 2018, 4:07am

I am trying to use the cuDNN backend for CTC loss. Following the docs, I wrote the following script:

# -*- coding: utf-8 -*-
import torch
from torch import nn

class Net(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(Net, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=(3, 3), padding=(1, 1)),
            nn.ReLU(),
            nn.BatchNorm2d(out_channels),
            nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2)),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.squeeze(dim=2)
        x = x.permute(2, 0, 1)
        return x


if __name__ == '__main__':
    device = torch.device('cuda:0')
    a = torch.zeros((2, 3, 3, 160)).to(device)
    net = Net(3, 10)
    net.to(device)
    log_probs = net(a).log_softmax(2)
    target_lengths = [30, 20]
    targets = torch.randint(1, 10, (sum(target_lengths),), dtype=torch.int)
    target_lengths = torch.Tensor(target_lengths).to(torch.int)
    input_lengths = torch.Tensor([80, 80]).to(torch.int)
    ctc = torch.nn.CTCLoss()
    res = ctc(log_probs, targets, input_lengths, target_lengths)
    print('log_probs: {}, type: {}'.format(log_probs.size(), log_probs.dtype))
    print('input_lengths: {}, type: {}'.format(input_lengths.size(), input_lengths.dtype))
    print('targets: {}, type: {}'.format(targets.size(), targets.dtype))
    print('target_lengths: {}, type: {}'.format(target_lengths.size(), target_lengths.dtype))
    print('res.grad_fn:', res.grad_fn)
    print('res.grad_fn:', res.grad_fn.name())

The output is:

D:\Anaconda3\python.exe E:/work/flask_learn/test.py
log_probs: torch.Size([80, 2, 10]), type: torch.float32
input_lengths: torch.Size([2]), type: torch.int32
targets: torch.Size([50]), type: torch.int32
target_lengths: torch.Size([2]), type: torch.int32
res.grad_fn: <MeanBackward1 object at 0x0000025E35A2CF98>
res.grad_fn: MeanBackward1

Process finished with exit code 0

The name of loss.grad_fn does not include “cudnn”.
Is there any problem with the code?