Do we need to insert blank label into the target before calling the CTC loss?

I have a question about nn.CTCLoss, and I would be grateful if anybody could answer it.

Example:

loss = nn.CTCLoss()
label = 'aman'
label_dict = {'<blank>': 0, 'a': 1, 'm': 2, 'n': 3}

So should I encode the label like this:
encoded_label = [1, 2, 1, 3]
or like this:
encoded_label = [0, 1, 0, 2, 0, 1, 0, 3, 0]
Note: this encoded_label will be passed as the targets argument of the loss mentioned above.

You want encoded_label = torch.tensor([1, 2, 1, 3]); otherwise the CTC loss will tip over, because index 0 is reserved for the blank. The CTCLoss implementation does compute the alignment over the blank-extended sequence, but it inserts this “padding” on the fly.
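
For concreteness, here is a minimal sketch of calling the loss with a blank-free target; the shapes T, N, C, S and the random log-probabilities are placeholder assumptions, not values from your model:

import torch
import torch.nn as nn

T, N, C, S = 10, 1, 4, 4  # time steps, batch, classes (incl. blank), target length
ctc_loss = nn.CTCLoss(blank=0)  # index 0 is the blank by default

# (T, N, C) log-probabilities, e.g. the log_softmax of a network output
log_probs = torch.randn(T, N, C).log_softmax(2)

# 'aman' encoded without blanks -- CTCLoss inserts them internally
targets = torch.tensor([[1, 2, 1, 3]])  # shape (N, S)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.tensor([S], dtype=torch.long)

loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)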

Best regards

Thomas


What if my labels have variable length?
label_1 = torch.tensor([1, 2, 1, 3])
label_2 = torch.tensor([1, 2, 3])
Do I need to pad them with zeros?

You don’t have to; the accepted layouts are described in the documentation (and the fast cuDNN path has extra requirements, so it can differ by device). The targets can either be a padded 2D tensor of shape (N, S_max), where you pad every label in the batch to the same length, or a single concatenated 1D tensor of shape (sum(target_lengths),). Either way, the loss recovers the true length of each target through the target_lengths argument, which should contain the length of each target in the batch.
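
As an illustration, here is a sketch of both layouts for the two labels above; the shapes T, N, C and the random log-probabilities are placeholder assumptions:

import torch
import torch.nn as nn

T, N, C = 12, 2, 4  # time steps, batch size, classes incl. blank
ctc_loss = nn.CTCLoss(blank=0)
log_probs = torch.randn(T, N, C).log_softmax(2)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.tensor([4, 3])  # true length of each label

# Option 1: padded 2D targets of shape (N, S_max); the padding value
# is ignored because target_lengths marks where each label ends
padded = torch.tensor([[1, 2, 1, 3],
                       [1, 2, 3, 0]])
loss1 = ctc_loss(log_probs, padded, input_lengths, target_lengths)

# Option 2: one concatenated 1D tensor of shape (sum(target_lengths),);
# this is also the layout the cuDNN fast path expects
concatenated = torch.tensor([1, 2, 1, 3, 1, 2, 3])
loss2 = ctc_loss(log_probs, concatenated, input_lengths, target_lengths)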