Hi, I wonder if there is a way to use `AdaptiveLogSoftmaxWithLoss()` instead of the regular `log_softmax()` in an LSTM model trained with `CTCLoss()`.

With `log_softmax()`, it boils down to essentially this:

```
output, _ = self.lstm(input)          # [batch, seq_length, lstm_size]
output = self.linear(output)          # [batch, seq_length, num_classes]
output = F.log_softmax(output, dim=2)
# to compute CTC loss (CTCLoss wants (seq_length, batch, num_classes))
ctc_loss = self.ctcLoss(output.transpose(0, 1), labels, original_input_lengths, original_label_lengths)
```

where the LSTM, linear, and CTC-loss layers are all standard PyTorch modules.
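For concreteness, here is a minimal runnable sketch of that pipeline; all sizes (batch, sequence length, hidden size, etc.) are made-up placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

batch, seq_len, input_size, lstm_size, num_classes = 4, 50, 16, 32, 100

lstm = nn.LSTM(input_size, lstm_size, num_layers=2, batch_first=True)
fc = nn.Linear(lstm_size, num_classes)
ctc = nn.CTCLoss(blank=0)

x = torch.randn(batch, seq_len, input_size)
out, _ = lstm(x)                          # [batch, seq_len, lstm_size]
out = F.log_softmax(fc(out), dim=2)       # [batch, seq_len, num_classes]

# CTCLoss expects log-probs of shape (seq_len, batch, num_classes),
# plus per-example input and label lengths.
labels = torch.randint(1, num_classes, (batch, 10))      # 0 is the blank
input_lengths = torch.full((batch,), seq_len, dtype=torch.long)
label_lengths = torch.full((batch,), 10, dtype=torch.long)
loss = ctc(out.transpose(0, 1), labels, input_lengths, label_lengths)
```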

This all works fine, except that we'd also like to improve model performance when `num_classes` is large.

`AdaptiveLogSoftmaxWithLoss()` seems to work very well for regular classification problems.
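For reference, on a plain (non-sequence) classification batch it is used like this; the feature size, class count, and cutoffs below are made-up placeholders:

```python
import torch
import torch.nn as nn

n_features, n_classes, batch = 32, 1000, 8
# cutoffs split the classes into a frequent "head" and rarer "tail" clusters
asm = nn.AdaptiveLogSoftmaxWithLoss(n_features, n_classes, cutoffs=[100, 500])

hidden = torch.randn(batch, n_features)
target = torch.randint(0, n_classes, (batch,))

out, loss = asm(hidden, target)   # out: target log-probs, shape (batch,)
log_probs = asm.log_prob(hidden)  # full distribution, shape (batch, n_classes)
```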

So I tried something like this:

```
output, _ = self.lstm(input)          # [batch, seq_length, lstm_size]
output = output.view(batch_size, -1)  # flatten to (batch, seq_length * lstm_size)
output, loss = self.AdaptiveSoftmax(output, labels)
# to compute CTC loss
ctc_loss = self.ctcLoss(output, labels, original_input_lengths, original_label_lengths)
```

This doesn’t seem to work: the `output` returned by `AdaptiveLogSoftmaxWithLoss()` is per-example (shape `(batch_size,)`, the target log-probabilities; its `log_prob()` method gives `(batch_size, n_classes)`), while `ctcLoss()` expects log-probabilities of shape `(seq_length, batch_size, num_classes)`.
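To make the mismatch concrete, `log_prob()` can at least be applied per time step by flattening the `batch * seq_length` steps into rows and reshaping back to what CTC expects; whether doing it this way keeps the speed/memory benefit of the adaptive softmax is exactly what I'm unsure about (sizes below are placeholders):

```python
import torch
import torch.nn as nn

batch, seq_len, lstm_size, num_classes = 4, 50, 32, 100
asm = nn.AdaptiveLogSoftmaxWithLoss(lstm_size, num_classes, cutoffs=[10, 50])

lstm_out = torch.randn(batch, seq_len, lstm_size)  # stand-in for LSTM output

# Treat every time step as its own example, get the full log-prob
# distribution back, then restore the (seq_len, batch, C) layout for CTCLoss.
flat = lstm_out.reshape(batch * seq_len, lstm_size)
log_probs = asm.log_prob(flat)                     # (batch*seq_len, num_classes)
log_probs = log_probs.view(batch, seq_len, num_classes).transpose(0, 1)
```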

So I wonder: what is the best way to use `AdaptiveLogSoftmaxWithLoss()` for sequence problems trained with CTC loss?