Hi, I wonder if there is a way to use `AdaptiveLogSoftmaxWithLoss()` instead of the regular `log_softmax()` in an LSTM model trained with `CTCLoss()`.

With `log_softmax()`, it boils down to essentially this:

```
output, _ = self.lstm(input)          # [batch, seq_length, lstm_size]
output = self.linear(output)          # [batch, seq_length, num_classes]
output = F.log_softmax(output, dim=2)
# to compute CTC loss (CTCLoss wants (seq_length, batch, num_classes))
ctc_loss = self.ctcLoss(output.transpose(0, 1), labels, original_input_lengths, original_label_lengths)
```

where the LSTM, linear, and CTC-loss layers are all standard PyTorch modules.
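For concreteness, here is a minimal runnable sketch of that pipeline; all sizes (batch, sequence length, hidden size, etc.) are made-up placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

batch, seq_len, input_size, lstm_size, num_classes = 4, 50, 16, 32, 100

lstm = nn.LSTM(input_size, lstm_size, num_layers=2, batch_first=True)
fc = nn.Linear(lstm_size, num_classes)
ctc = nn.CTCLoss(blank=0)

x = torch.randn(batch, seq_len, input_size)
out, _ = lstm(x)                          # [batch, seq_len, lstm_size]
out = F.log_softmax(fc(out), dim=2)       # [batch, seq_len, num_classes]

# CTCLoss expects log-probs of shape (seq_len, batch, num_classes),
# plus per-example input and label lengths.
labels = torch.randint(1, num_classes, (batch, 10))      # 0 is the blank
input_lengths = torch.full((batch,), seq_len, dtype=torch.long)
label_lengths = torch.full((batch,), 10, dtype=torch.long)
loss = ctc(out.transpose(0, 1), labels, input_lengths, label_lengths)
```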

This all works fine, except that we'd also like to improve model performance when `num_classes` is large.

`AdaptiveLogSoftmaxWithLoss()` seems to work very well for regular classification problems.
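For reference, on a plain (non-sequence) classification batch it is used like this; the feature size, class count, and cutoffs below are made-up placeholders:

```python
import torch
import torch.nn as nn

n_features, n_classes, batch = 32, 1000, 8
# cutoffs split the classes into a frequent "head" and rarer "tail" clusters
asm = nn.AdaptiveLogSoftmaxWithLoss(n_features, n_classes, cutoffs=[100, 500])

hidden = torch.randn(batch, n_features)
target = torch.randint(0, n_classes, (batch,))

out, loss = asm(hidden, target)   # out: target log-probs, shape (batch,)
log_probs = asm.log_prob(hidden)  # full distribution, shape (batch, n_classes)
```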

So I tried something like this:

```
output, _ = self.lstm(input)          # [batch, seq_length, lstm_size]
output = output.view(batch_size, -1)  # flatten to (batch, seq_length * lstm_size)
output, loss = self.AdaptiveSoftmax(output, labels)
# to compute CTC loss
ctc_loss = self.ctcLoss(output, labels, original_input_lengths, original_label_lengths)
```

This doesn’t seem to work: the `output` returned by `AdaptiveLogSoftmaxWithLoss()` is per-example (shape `(batch_size,)`, the target log-probabilities; its `log_prob()` method gives `(batch_size, n_classes)`), while `ctcLoss()` expects log-probabilities of shape `(seq_length, batch_size, num_classes)`.
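To make the mismatch concrete, `log_prob()` can at least be applied per time step by flattening the `batch * seq_length` steps into rows and reshaping back to what CTC expects; whether doing it this way keeps the speed/memory benefit of the adaptive softmax is exactly what I'm unsure about (sizes below are placeholders):

```python
import torch
import torch.nn as nn

batch, seq_len, lstm_size, num_classes = 4, 50, 32, 100
asm = nn.AdaptiveLogSoftmaxWithLoss(lstm_size, num_classes, cutoffs=[10, 50])

lstm_out = torch.randn(batch, seq_len, lstm_size)  # stand-in for LSTM output

# Treat every time step as its own example, get the full log-prob
# distribution back, then restore the (seq_len, batch, C) layout for CTCLoss.
flat = lstm_out.reshape(batch * seq_len, lstm_size)
log_probs = asm.log_prob(flat)                     # (batch*seq_len, num_classes)
log_probs = log_probs.view(batch, seq_len, num_classes).transpose(0, 1)
```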

So I wonder: what is the best way to use `AdaptiveLogSoftmaxWithLoss()` for sequence problems trained with CTC loss?