How exactly should the labels be formatted in PyTorch 1.0 to use CTCLoss?
The way it is described in the original paper, per-character classification uses
the token classes plus one extra blank class.
So if my original label is “CAAT”
and the index map is C:1, A:2, T:3,
then for a normal cross-entropy loss the label would be [1, 2, 2, 3].
What should it be for CTC loss? Will the label work if I just blindly add the blank token after every token in the original label:
To predict per char → “CAAT”
Token map → blank:0, C:1, A:2, T:3
Label → [1, 0, 2, 0, 2, 0, 3, 0]
Is this correct?
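For reference, here is a minimal sketch of how I read the `torch.nn.CTCLoss` documentation: the targets hold only the real token indices, with no blanks inserted, and the loss accounts for the blank class (index given by `blank`) internally. The sequence length `T`, the batch size `N`, the random `log_probs`, and the `encode` helper are assumptions made up for illustration; in practice `log_probs` would come from the model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Token map from the question; index 0 is reserved for the blank class.
char_to_idx = {"C": 1, "A": 2, "T": 3}
num_classes = len(char_to_idx) + 1  # tokens + blank


def encode(text):
    """Hypothetical helper: map a string to token indices (no blanks added)."""
    return [char_to_idx[ch] for ch in text]


# Assumed sizes for illustration: T time steps, batch of N sequences.
T, N = 12, 1

# Stand-in for the model output: log-probabilities of shape (T, N, C).
log_probs = torch.randn(T, N, num_classes).log_softmax(2)

# Padded target form, shape (N, S). "CAAT" stays [1, 2, 2, 3].
targets = torch.tensor([encode("CAAT")], dtype=torch.long)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.tensor([4], dtype=torch.long)

ctc_loss = nn.CTCLoss(blank=0)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
print(targets.tolist(), loss.item())
```

Note that the repeated “A” means the input needs at least one more time step than the target length, since CTC has to emit a blank between the two identical tokens.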
Also, to use cuDNN, the PyTorch docs mention that the targets need to be in “concatenated form”.
How so? Concatenated along which dimension? And what would the target lengths be in that case?
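A sketch of what “concatenated form” appears to mean in the `torch.nn.CTCLoss` docs, as far as I can tell: all target sequences of the batch are joined end to end into a single 1-D tensor of length `sum(target_lengths)`, and `target_lengths` keeps the per-sequence lengths so the loss can split them again. The batch of labels below is hypothetical, and the example runs on CPU only; the docs list further conditions for the cuDNN path (for example `blank=0` and `torch.int32` length tensors) that this sketch does not exercise.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

char_to_idx = {"C": 1, "A": 2, "T": 3}  # blank is index 0
labels = ["CAAT", "CAT", "TA"]          # hypothetical batch of labels
encoded = [[char_to_idx[ch] for ch in text] for text in labels]

# Per-sequence lengths, shape (N,).
target_lengths = torch.tensor([len(seq) for seq in encoded], dtype=torch.long)

# Concatenated form: one flat 1-D tensor, shape (sum(target_lengths),).
targets_concat = torch.tensor(
    [idx for seq in encoded for idx in seq], dtype=torch.long
)

# Padded form for comparison, shape (N, max_len); entries past each
# sequence's length are padding and are skipped via target_lengths.
max_len = int(target_lengths.max())
targets_padded = torch.zeros(len(encoded), max_len, dtype=torch.long)
for row, seq in enumerate(encoded):
    targets_padded[row, : len(seq)] = torch.tensor(seq)

# Assumed sizes and random stand-in for the model output, shape (T, N, C).
T, N, C = 12, len(labels), len(char_to_idx) + 1
log_probs = torch.randn(T, N, C).log_softmax(2)
input_lengths = torch.full((N,), T, dtype=torch.long)

ctc_loss = nn.CTCLoss(blank=0)
loss_concat = ctc_loss(log_probs, targets_concat, input_lengths, target_lengths)
loss_padded = ctc_loss(log_probs, targets_padded, input_lengths, target_lengths)
print(targets_concat.tolist(), target_lengths.tolist())
```

Both forms describe the same targets, so the two losses should agree; only the layout of the `targets` tensor differs.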