CTC loss error after first epoch

I am training a character recognizer on the LibriSpeech dataset.

The input is the Mel-spectrogram --> tensor of shape [10, 1, 512, 2000]
The labels --> tensor of shape [10, 300]
The output --> tensor of shape [1000, 10, 29]
input_lengths = [1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000]
target_lengths = [300, 300, 300, 300, 300, 300, 300, 300, 300, 300]

Here is a snapshot of the code:

criterion_back = torch.nn.CTCLoss(blank=28)
output = back_model(enhanced)
output = F.log_softmax(output, dim=2)
output = output.transpose(0, 1)
back_loss = criterion_back(output, label, input_lengths, target_lengths)

The problem is that at the beginning of the second epoch I get this error:
Hint: the first epoch runs smoothly without any error; the error only appears when the second epoch starts.

File "train.py", line 95, in <module>
    back_loss = criterion_back(output, label, input_length, target_length)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 1714, in forward
    return F.ctc_loss(log_probs, targets, input_lengths, target_lengths, self.blank, self.reduction,
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 2559, in ctc_loss
    return torch.ctc_loss(
RuntimeError: input_lengths must be of size batch_size

Hi @Mohamed_Nabih, according to the CTCLoss documentation, the expected shape of the input is [time, batch, num_class]. In your case, the output shape is [1000, 10, 29], where 1000 is the number of frames and 10 is the batch size, so it is already in the right layout and you should not apply output = output.transpose(0, 1) after log_softmax. After the transpose, dimension 1 has size 1000, but input_lengths only has 10 entries, which is exactly the "input_lengths must be of size batch_size" mismatch.

Can you try removing the transpose line?
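To illustrate, here is a minimal self-contained sketch using the shapes from your post (T=1000, N=10, C=29 are taken from the question; the random tensors are just stand-ins for your model output and labels). The log-softmax output is passed to CTCLoss directly, without the transpose:

```python
import torch
import torch.nn.functional as F

T, N, C = 1000, 10, 29                      # time steps, batch size, num classes (from the post)
output = torch.randn(T, N, C)               # stand-in for back_model(enhanced): already [time, batch, num_class]
label = torch.randint(0, 28, (N, 300))      # stand-in targets; class 28 is reserved for blank
input_lengths = torch.full((N,), T, dtype=torch.long)    # one entry per batch element
target_lengths = torch.full((N,), 300, dtype=torch.long)

criterion_back = torch.nn.CTCLoss(blank=28)
log_probs = F.log_softmax(output, dim=2)    # no transpose: dim 0 is already time
loss = criterion_back(log_probs, label, input_lengths, target_lengths)
```

With the transpose in place, `log_probs` would have batch dimension 1000 while `input_lengths` has 10 entries, and `torch.ctc_loss` raises the RuntimeError you saw.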