Hi, I wonder if there is a way to use `AdaptiveLogSoftmaxWithLoss()` instead of the regular `log_softmax()` in an LSTM model trained with `CTCLoss()`.

With `log_softmax()`, it boils down to essentially this:

```
output, _ = self.lstm(input)          # [batch, seq_length, lstm_size]
output = self.linear(output)          # [batch, seq_length, num_classes]
output = F.log_softmax(output, dim=2)
# to compute CTC loss (CTCLoss wants (seq_length, batch, num_classes))
ctc_loss = self.ctcLoss(output.transpose(0, 1), labels, original_input_lengths, original_label_lengths)
```

where the LSTM, linear, and CTC-loss layers are all standard PyTorch modules.
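For concreteness, here is a minimal runnable sketch of that pipeline; all sizes (batch, sequence length, hidden size, etc.) are made-up placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

batch, seq_len, input_size, lstm_size, num_classes = 4, 50, 16, 32, 100

lstm = nn.LSTM(input_size, lstm_size, num_layers=2, batch_first=True)
fc = nn.Linear(lstm_size, num_classes)
ctc = nn.CTCLoss(blank=0)

x = torch.randn(batch, seq_len, input_size)
out, _ = lstm(x)                          # [batch, seq_len, lstm_size]
out = F.log_softmax(fc(out), dim=2)       # [batch, seq_len, num_classes]

# CTCLoss expects log-probs of shape (seq_len, batch, num_classes),
# plus per-example input and label lengths.
labels = torch.randint(1, num_classes, (batch, 10))      # 0 is the blank
input_lengths = torch.full((batch,), seq_len, dtype=torch.long)
label_lengths = torch.full((batch,), 10, dtype=torch.long)
loss = ctc(out.transpose(0, 1), labels, input_lengths, label_lengths)
```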

This all works fine, except that we'd also like to improve model performance when `num_classes` is large.

`AdaptiveLogSoftmaxWithLoss()` seems to work very well for regular classification problems.
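For reference, on a plain (non-sequence) classification batch it is used like this; the feature size, class count, and cutoffs below are made-up placeholders:

```python
import torch
import torch.nn as nn

n_features, n_classes, batch = 32, 1000, 8
# cutoffs split the classes into a frequent "head" and rarer "tail" clusters
asm = nn.AdaptiveLogSoftmaxWithLoss(n_features, n_classes, cutoffs=[100, 500])

hidden = torch.randn(batch, n_features)
target = torch.randint(0, n_classes, (batch,))

out, loss = asm(hidden, target)   # out: target log-probs, shape (batch,)
log_probs = asm.log_prob(hidden)  # full distribution, shape (batch, n_classes)
```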

So I tried something like this:

```
output, _ = self.lstm(input)          # [batch, seq_length, lstm_size]
output = output.view(batch_size, -1)  # flatten to (batch, seq_length * lstm_size)
output, loss = self.AdaptiveSoftmax(output, labels)
# to compute CTC loss
ctc_loss = self.ctcLoss(output, labels, original_input_lengths, original_label_lengths)
```

This doesn’t seem to work: the `output` returned by `AdaptiveLogSoftmaxWithLoss()` is per-example (shape `(batch_size,)`, the target log-probabilities; its `log_prob()` method gives `(batch_size, n_classes)`), while `ctcLoss()` expects log-probabilities of shape `(seq_length, batch_size, num_classes)`.
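To make the mismatch concrete, `log_prob()` can at least be applied per time step by flattening the `batch * seq_length` steps into rows and reshaping back to what CTC expects; whether doing it this way keeps the speed/memory benefit of the adaptive softmax is exactly what I'm unsure about (sizes below are placeholders):

```python
import torch
import torch.nn as nn

batch, seq_len, lstm_size, num_classes = 4, 50, 32, 100
asm = nn.AdaptiveLogSoftmaxWithLoss(lstm_size, num_classes, cutoffs=[10, 50])

lstm_out = torch.randn(batch, seq_len, lstm_size)  # stand-in for LSTM output

# Treat every time step as its own example, get the full log-prob
# distribution back, then restore the (seq_len, batch, C) layout for CTCLoss.
flat = lstm_out.reshape(batch * seq_len, lstm_size)
log_probs = asm.log_prob(flat)                     # (batch*seq_len, num_classes)
log_probs = log_probs.view(batch, seq_len, num_classes).transpose(0, 1)
```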

So I wonder: what is the best way to use `AdaptiveLogSoftmaxWithLoss()` for sequence problems trained with CTC loss?