hi, I found that the gradient of nn.CTCLoss is the derivative of the objective function with respect to the unnormalised outputs u(t, k). Why is it not the derivative with respect to the loss's actual input (the log probs obtained with log_softmax)? If it already returns the logit-space gradient, does autograd apply the softmax/log_softmax Jacobian a second time when backpropagating through that layer?
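
To make the question concrete, here is a minimal single-timestep sketch in plain Python (no torch needed). It uses an ordinary NLL loss in place of the full CTC sum over alignments, with made-up logit values, since the chain-rule structure through log_softmax is the same: the gradient taken directly with respect to the unnormalised outputs already folds in the log_softmax Jacobian, which is why applying that Jacobian again in backprop would double-count it.

```python
import math

# Hypothetical single-timestep example: 3 classes, target class k = 1.
# Plain NLL stands in for the full CTC objective here.
z = [1.0, 2.0, 0.5]          # unnormalised outputs u(t, k) ("logits")
k = 1                        # target class index

# softmax probabilities
denom = sum(math.exp(v) for v in z)
p = [math.exp(v) / denom for v in z]

# Route 1: gradient of -log p[k] taken directly w.r.t. the logits z,
# the well-known "softmax minus one-hot" form.
grad_wrt_logits = [p[j] - (1.0 if j == k else 0.0) for j in range(3)]

# Route 2: chain rule. First the gradient w.r.t. the log-probs
# (just minus the one-hot vector), then through the log_softmax
# Jacobian J[i][j] = delta_ij - p[j].
grad_wrt_logp = [-(1.0 if i == k else 0.0) for i in range(3)]
grad_chained = [
    sum(grad_wrt_logp[i] * ((1.0 if i == j else 0.0) - p[j])
        for i in range(3))
    for j in range(3)
]

# Both routes give the same vector. So if the loss already returned
# the logit-space gradient AND autograd also backpropagated it
# through log_softmax, the Jacobian would be applied twice.
print(grad_wrt_logits)
print(grad_chained)
```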

Thank you!