Very Low CTCLoss for incorrect prediction

Hari_Krishnan · February 5, 2022, 1:18pm

After training finished with a very low CTC loss, the OCR model still performed very poorly. The following code shows how I calculated the loss and how I structured the input:

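for data in test_dataloader:
    loss_criterion = nn.CTCLoss()
    target_lengths = data['lengths']
    N = data['pixel_values'].size()[0]  # batch size
    data.pop("lengths")  # remove "lengths" before calling the model
    pred = model(**data)
    pred = pred['logits'].permute(1,0,2)  # reorder to (T, N, C), the layout nn.CTCLoss expects
    pred = nn.functional.log_softmax(pred, dim = 2)  # log-probabilities over the class dimension
    # every sample uses the full output length of 128 time steps
    input_lengths = torch.full(size=(N,), fill_value=128, dtype=torch.long)
    custom_loss = loss_criterion(pred, data['labels'].long(), input_lengths, target_lengths)
    pred = pred.max(1)  # (values, indices) of the maximum along dim 1
    print("Loss",custom_loss.item())
    print(input_lengths, target_lengths)
    print(pred[1][0][:10])  # first 10 indices returned by max
    print(data['labels'][0][:10])  # first 10 target labels of the first sample
    break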

The output of the above code was:

Loss 2.682149897736963e-05
tensor([128]) tensor([4])
tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0')
tensor([    2, 16869, 11603,     3,     0,     0,     0,     0,     0,     0],
       device='cuda:0')

As you can see, the expected output and the predicted output are completely different, yet the loss is extremely low. The outputs are padded to length 128, as the printed lengths show. The loss is calculated the same way as in the documentation. What could be the reason for this?
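
For reference, here is a minimal, framework-free sketch of greedy CTC decoding, which is one way to turn per-time-step scores into a label sequence that can be compared with the targets. It assumes scores laid out as (T, N, C), the layout nn.CTCLoss expects and the one produced by permute(1,0,2) above, so the class axis is the last one (dim 2). Note that in that layout, max(1) reduces over the batch axis, and with N = 1 it returns index 0 for every entry. The function name greedy_ctc_decode and the toy scores are made up for illustration, and blank=0 matches the nn.CTCLoss default.

```python
def greedy_ctc_decode(log_probs, blank=0):
    """Greedy CTC decoding for scores given as [T][N][C] nested lists.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    num_steps = len(log_probs)
    batch_size = len(log_probs[0])
    decoded = []
    for n in range(batch_size):
        # Best class per time step: argmax over the class axis (the last one).
        best_path = [
            max(range(len(log_probs[t][n])), key=lambda c: log_probs[t][n][c])
            for t in range(num_steps)
        ]
        # Collapse consecutive repeats, then drop blanks.
        labels = []
        previous = None
        for idx in best_path:
            if idx != previous and idx != blank:
                labels.append(idx)
            previous = idx
        decoded.append(labels)
    return decoded


# Toy example: T=5 time steps, N=1 sample, C=3 classes (class 0 is the blank).
toy_scores = [
    [[0.1, 0.8, 0.1]],    # class 1
    [[0.1, 0.8, 0.1]],    # class 1 again (repeat, collapsed)
    [[0.9, 0.05, 0.05]],  # blank
    [[0.1, 0.1, 0.8]],    # class 2
    [[0.7, 0.2, 0.1]],    # blank
]
decoded = greedy_ctc_decode(toy_scores)
print(decoded)  # [[1, 2]]
```

On a PyTorch tensor of shape (T, N, C), the same per-time-step argmax would be pred.argmax(dim=2), which yields a (T, N) tensor of class indices that can then be collapsed in the same way.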