I trained a model with a log_softmax activation in the last layer.

When evaluating the model, can I just print model(x) to get the actual probabilities?

If so, why does Keras use softmax instead of log_softmax in the last layer, and it still works fine?

PyTorch uses `log_softmax` instead of first applying `softmax` and then `log`, for numerical stability, as described in the LogSumExp trick.
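
To make the stability point concrete, here is a minimal sketch in plain Python (no PyTorch) comparing a naive "softmax, then log" with a version that uses the LogSumExp trick. The function names and the example logits are made up for illustration, and this is not how PyTorch implements `log_softmax` internally.

```python
import math

def naive_log_softmax(logits):
    # Softmax first, then log: exp() of a large logit overflows.
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [math.log(e / total) for e in exps]

def stable_log_softmax(logits):
    # LogSumExp trick: subtract the max before exponentiating,
    # so the largest exponent is exp(0) = 1 and nothing overflows.
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

small = [1.0, 2.0, 3.0]           # hypothetical logits
large = [1000.0, 1001.0, 1002.0]  # same logits shifted by about 1000

print(stable_log_softmax(small))
print(stable_log_softmax(large))  # same values up to rounding

try:
    naive_log_softmax(large)
except OverflowError as err:
    print("naive version failed:", err)
```

Both inputs give the same log-probabilities with the stable version, because shifting all logits by a constant does not change the softmax. The naive version fails on the large logits because `math.exp(1000.0)` overflows.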

If you want to print the probabilities, you could just use `torch.exp` on the output.

So that means if I use log_softmax, I should call exp on the output, i.e. exp(out)? Otherwise, will argmax return the wrong class when classifying the data?

If you just want the argmax, you can keep the log_softmax output and apply argmax to it directly: exp is monotonically increasing, so it does not change which class has the largest value. If you want the actual softmax probabilities, use exp(out).
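
A quick way to check this yourself is the sketch below. It uses random logits in place of a real model output, so the tensor shape and the seed are assumptions made only for the example.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# Stand-in for the pre-activation output of a model:
# a hypothetical batch of 4 samples and 3 classes.
logits = torch.randn(4, 3)

log_probs = F.log_softmax(logits, dim=1)  # what the model would return
probs = torch.exp(log_probs)              # recover the probabilities

# Each row of probs sums to 1 (up to floating-point error).
print(probs.sum(dim=1))

# The predicted class is the same from either tensor.
pred_from_log = log_probs.argmax(dim=1)
pred_from_probs = probs.argmax(dim=1)
print(torch.equal(pred_from_log, pred_from_probs))
```

So `torch.exp` is only needed when you want to read or report the probabilities; for picking the predicted class, the log-probabilities are enough.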

