I am using the nn.CTCLoss(zero_infinity=True) loss function on a CRNN model. The output on the validation set after training the model for a few epochs is shown below.
The batch size is 8 and max_length for each sequence is 16. All the outputs are easier to read in this Colab notebook: Link to colab notebook
The first line contains the two length tensors passed to the loss, the second of which is target_lengths. The second line contains the shape of the model output followed by the shape of targets. From the third line onwards, the first string is the raw predicted output, the second string is the processed predicted output, and the third string is the ground truth.
tensor([16, 16, 16, 16, 16, 16, 16, 16], dtype=torch.int32) tensor([5, 4, 1, 5, 5, 4, 5, 1])
torch.Size([16, 8, 41]) torch.Size([8, 16])
eeeeeeeeeeeee222 e2 22.44___________
eeeeeeeeeeeee222 e2 8.70____________
eeeeeeeeeeeee222 e2 0_______________
eeeeeeeeeeeee222 e2 12.90___________
eeeeeeeeeeeee222 e2 15.80___________
eeeeeeeeeeeee222 e2 2.80____________
eeeeeeeeeeeee222 e2 11.50___________
eeeeeeeeeeeee222 e2 4_______________
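For context, here is a minimal sketch of how the processed string could be obtained from the raw one, assuming the usual greedy CTC post-processing (collapse consecutive repeats, then drop blanks). The function name ctc_greedy_collapse and the blank symbol "-" are hypothetical and chosen only for illustration; the actual processing code is in the notebook.

```python
from itertools import groupby


def ctc_greedy_collapse(raw: str, blank: str = "-") -> str:
    """Collapse runs of identical characters, then remove the blank symbol.

    `blank` is a hypothetical placeholder; the real blank symbol depends
    on the vocabulary used by the model.
    """
    # groupby yields one key per run of identical consecutive characters
    collapsed = [ch for ch, _ in groupby(raw)]
    return "".join(ch for ch in collapsed if ch != blank)


# The raw prediction from the output above collapses to the processed one
print(ctc_greedy_collapse("eeeeeeeeeeeee222"))  # e2
```

Under this assumption, the raw string containing only runs of "e" and "2" collapses to "e2", which matches the second column of the output.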
As the output shows, the model predicts the same string for every input. The problem could be either that the arguments passed to the CTC loss function are wrong, or something else entirely.
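To make the question about the arguments concrete, here is a minimal sketch of an nn.CTCLoss call with the shapes printed above (T=16, N=8, C=41). It is not my actual training code: the random logits stand in for the CRNN output, the random targets stand in for the encoded labels, and blank index 0 is an assumption. It only illustrates the argument order (log_probs, targets, input_lengths, target_lengths) and that log_probs must be log-softmaxed with shape (T, N, C).

```python
import torch
import torch.nn as nn

T, N, C = 16, 8, 41  # time steps, batch size, number of classes (from the printed shapes)
torch.manual_seed(0)

# Stand-in for the CRNN output; CTCLoss expects log-probabilities of shape (T, N, C)
logits = torch.randn(T, N, C, requires_grad=True)
log_probs = logits.log_softmax(dim=2)

# Padded targets of shape (N, 16). Labels are drawn from 1..C-1 because
# index 0 is assumed to be the blank and must not appear in the targets.
targets = torch.randint(1, C, (N, 16), dtype=torch.long)

# Every sequence uses all T time steps; target_lengths give the unpadded label lengths
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.tensor([5, 4, 1, 5, 5, 4, 5, 1], dtype=torch.long)

criterion = nn.CTCLoss(blank=0, zero_infinity=True)
loss = criterion(log_probs, targets, input_lengths, target_lengths)
loss.backward()
print(loss.item())
```

With 2D padded targets, only the first target_lengths[i] entries of each row are used, so the padding value itself does not affect the loss as long as target_lengths is correct.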