As far as I know, for training you need log_softmax. For inference you can just take the argmax. But argmax only gives you Top-1 accuracy; if you apply softmax and take the top 5 scores, you can compute Top-5 accuracy. I think the DeepSpeech model does something similar.
So it's clear now that for training I need to use log_softmax, and for inference I can use softmax to get the top-k scores.
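To make that split concrete, here is a minimal PyTorch sketch of what I mean (the logits and targets are made up for illustration): log_softmax feeds the training loss, while softmax + topk serves inference. Note that for Top-1 alone, argmax on the raw logits already suffices, since softmax is monotonic and doesn't change the argmax.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical model outputs: batch of 2 examples, 10 classes
logits = torch.randn(2, 10)
targets = torch.tensor([3, 7])

# Training: log_softmax pairs with NLLLoss
# (log_softmax + nll_loss is equivalent to cross_entropy on raw logits)
log_probs = F.log_softmax(logits, dim=-1)
loss = F.nll_loss(log_probs, targets)

# Inference: softmax probabilities, then top-k for Top-5 accuracy
probs = F.softmax(logits, dim=-1)
top5_scores, top5_indices = probs.topk(5, dim=-1)

# Top-1 doesn't need softmax at all: argmax on the logits gives
# the same prediction, because softmax preserves the ordering
top1 = logits.argmax(dim=-1)
```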
What isn’t clear is why the DeepSpeech implementation in the repo doesn’t use log_softmax. I would expect an explicit call to log_softmax in the model definition or wherever the model is called, right? Or did I miss something?