According to the docs:
"As with NLLLoss, the input given is expected to contain log-probabilities and is not restricted to a 2D Tensor. The targets are given as probabilities (i.e. without taking the logarithm)."
Your code snippet looks alright. I would recommend using log_softmax instead of softmax().log(), as the former is numerically more stable.
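
To illustrate the stability point, here is a minimal sketch. It assumes the loss in question is KL divergence (the quoted passage reads like the nn.KLDivLoss documentation), and the logits and targets are made-up values chosen to force an underflow, not taken from your code:

```python
import torch
import torch.nn.functional as F

# Hypothetical logits with a large gap between classes (illustration only).
logits = torch.tensor([[0.0, 200.0]])

# Naive: softmax underflows to exactly 0 for the first class in float32,
# so taking the log afterwards yields -inf.
naive = F.softmax(logits, dim=1).log()

# Stable: log_softmax works in log space and keeps the value finite.
stable = F.log_softmax(logits, dim=1)

# Input is log-probabilities, target is plain probabilities (no log taken).
target = torch.tensor([[0.5, 0.5]])
loss_naive = F.kl_div(naive, target, reduction="batchmean")
loss_stable = F.kl_div(stable, target, reduction="batchmean")

print(naive, stable)
print(loss_naive, loss_stable)
```

With these values the naive version should produce -inf in the log-probabilities and an infinite loss, while the log_softmax version stays finite. For well-behaved logits the two give the same result, so the difference only shows up in extreme cases like this one.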