Hi, I wonder if there is a way to use `AdaptiveLogSoftmaxWithLoss()` instead of the regular `log_softmax()` in an LSTM model with `CTCLoss()`.
With `log_softmax()`, it essentially boils down to this:
```python
output = self.LSTM(input_size, lstm_size, num_layers)
output = self.Linear(fc_input, fc_output)
output = F.log_softmax(output, dim=2)  # [batch, seq_length, class]
# to compute CTC loss
ctc_loss = self.ctcLoss(output, labels, original_input_lengths, original_label_lengths)
```
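For concreteness, here is a minimal self-contained version of that baseline (all sizes and tensor names here are illustrative, not the real model's):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes for illustration only
batch, seq_len, input_size, hidden, num_classes = 4, 50, 80, 128, 30

lstm = nn.LSTM(input_size, hidden, num_layers=2, batch_first=True)
fc = nn.Linear(hidden, num_classes)
ctc = nn.CTCLoss(blank=0)

x = torch.randn(batch, seq_len, input_size)
out, _ = lstm(x)                         # (batch, seq_len, hidden)
out = F.log_softmax(fc(out), dim=2)      # (batch, seq_len, num_classes)

# nn.CTCLoss wants log-probs of shape (seq_len, batch, num_classes)
log_probs = out.transpose(0, 1)
targets = torch.randint(1, num_classes, (batch, 20), dtype=torch.long)
input_lengths = torch.full((batch,), seq_len, dtype=torch.long)
target_lengths = torch.full((batch,), 20, dtype=torch.long)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```

This trains end to end; the only subtlety is the transpose, since `nn.CTCLoss` consumes time-major log-probabilities.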
where the LSTM, linear, and ctcLoss pieces are all standard PyTorch layers/functions. This works fine, except that we'd also like to improve model performance when num_classes is large.
`AdaptiveLogSoftmaxWithLoss()` seems to work very well for regular classification problems.
So I tried something like this:
```python
output = self.LSTM(input_size, lstm_size, num_layers)
output = output.view(batch_size, -1)
output, loss = self.AdaptiveSoftmax(output, labels)
# to compute CTC loss
ctc_loss = self.ctcLoss(output, labels, original_input_lengths, original_label_lengths)
```
This doesn't work: the `output` returned by `AdaptiveLogSoftmaxWithLoss()` holds only the log-probability of each sample's target class (shape `(batch_size,)`), while `ctcLoss()` expects full log-probabilities over all classes, of shape `(seq_length, batch_size, num_classes)`.
So I wonder: what would be the best way to use `AdaptiveLogSoftmaxWithLoss()` for sequence problems trained with CTC loss?
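One direction I've considered (a sketch only, sizes and cutoffs are hypothetical): `AdaptiveLogSoftmaxWithLoss.log_prob()` returns full log-probabilities of shape `(N, n_classes)`, so the timesteps can be flattened, scored, and reshaped back for `CTCLoss` — though I suspect this gives up much of the speed benefit, since it materializes scores for every class again:

```python
import torch
import torch.nn as nn

# Hypothetical sizes; a large num_classes is the case we care about
batch, seq_len, hidden, num_classes = 4, 50, 128, 1000

lstm_out = torch.randn(batch, seq_len, hidden)  # stand-in for the LSTM output
# cutoffs are illustrative and would need tuning to the real label distribution
asm = nn.AdaptiveLogSoftmaxWithLoss(hidden, num_classes, cutoffs=[100, 500])
ctc = nn.CTCLoss(blank=0)

# Flatten (batch, seq_len) into one axis, score all classes, then restore shape
flat = lstm_out.reshape(batch * seq_len, hidden)
log_probs = asm.log_prob(flat)                          # (batch*seq_len, num_classes)
log_probs = log_probs.view(batch, seq_len, num_classes).transpose(0, 1)

targets = torch.randint(1, num_classes, (batch, 20), dtype=torch.long)
input_lengths = torch.full((batch,), seq_len, dtype=torch.long)
target_lengths = torch.full((batch,), 20, dtype=torch.long)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```

This at least produces tensors with the shapes `CTCLoss` requires; whether it is actually faster than a plain `Linear` + `log_softmax` for a given `num_classes` is exactly what I'm unsure about.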