Do we need to insert blank label into the target before calling the CTC loss?

I have a question about nn.CTCLoss, and I would be grateful if anybody could answer it.

Example:

loss = nn.CTCLoss()
label = 'aman'
label_dict = {'<blank>': 0, 'a': 1, 'm': 2, 'n': 3}

So should I encode the label like this:
encoded_label = [1, 2, 1, 3]
or like this:
encoded_label = [0, 1, 0, 2, 0, 1, 0, 3, 0]
Note: this encoded_label will be passed as the targets argument of the loss mentioned above.

You want encoded_label = torch.tensor([1, 2, 1, 3]); otherwise the CTC loss will tip over, because index 0 is reserved for the blank. The CTCLoss implementation does compute the alignment over the blank-extended sequence, but it inserts this “padding” on the fly.
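
For concreteness, here is a minimal sketch of calling the loss with a blank-free target; the shapes T, N, C, S and the random log-probabilities are placeholder assumptions, not values from your model:

import torch
import torch.nn as nn

T, N, C, S = 10, 1, 4, 4  # time steps, batch, classes (incl. blank), target length
ctc_loss = nn.CTCLoss(blank=0)  # index 0 is the blank by default

# (T, N, C) log-probabilities, e.g. the log_softmax of a network output
log_probs = torch.randn(T, N, C).log_softmax(2)

# 'aman' encoded without blanks -- CTCLoss inserts them internally
targets = torch.tensor([[1, 2, 1, 3]])  # shape (N, S)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.tensor([S], dtype=torch.long)

loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)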

Best regards

Thomas


What if my labels have variable length?
label_1 = torch.tensor([1, 2, 1, 3])
label_2 = torch.tensor([1, 2, 3])
Do I need to pad them with zeros?

You don’t have to; the accepted layouts are described in the documentation (and the fast cuDNN path has extra requirements, so it can differ by device). The targets can either be a padded 2D tensor of shape (N, S_max), where you pad every label in the batch to the same length, or a single concatenated 1D tensor of shape (sum(target_lengths),). Either way, the loss recovers the true length of each target through the target_lengths argument, which should contain the length of each target in the batch.
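
As an illustration, here is a sketch of both layouts for the two labels above; the shapes T, N, C and the random log-probabilities are placeholder assumptions:

import torch
import torch.nn as nn

T, N, C = 12, 2, 4  # time steps, batch size, classes incl. blank
ctc_loss = nn.CTCLoss(blank=0)
log_probs = torch.randn(T, N, C).log_softmax(2)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.tensor([4, 3])  # true length of each label

# Option 1: padded 2D targets of shape (N, S_max); the padding value
# is ignored because target_lengths marks where each label ends
padded = torch.tensor([[1, 2, 1, 3],
                       [1, 2, 3, 0]])
loss1 = ctc_loss(log_probs, padded, input_lengths, target_lengths)

# Option 2: one concatenated 1D tensor of shape (sum(target_lengths),);
# this is also the layout the cuDNN fast path expects
concatenated = torch.tensor([1, 2, 1, 3, 1, 2, 3])
loss2 = ctc_loss(log_probs, concatenated, input_lengths, target_lengths)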