# Loss nll - formula in documentation

Greetings,

I am a bit confused by the documented formula for the negative log-likelihood loss:

What are those x_(n,y_n)? The documentation says x is the input, but the loss is not calculated from the input. A loss should be calculated from the output and the target, should it not?

Best,
PiF

edit: is the following assumption correct?
x_(n,y_n) is the entry of the network output vector that corresponds to the probability of the target y_n, where n denotes the sample.

Hi Physics!

Yes, the terminology in the documentation is somewhat unfortunate.

The argument to the loss functions that PyTorch refers to as `input` is
indeed the "output" of the network (or derived from it). I prefer to call
this the "prediction." The loss function then compares the "prediction"
with the "ground truth" label, which is the argument that PyTorch calls
the `target`.

I guess one can think of the "prediction" as being the "input" to the loss
function (but so is the `target`).

So "output" of model = "prediction" = "input" to loss function. But using
the term `input` in the loss-function documentation has always seemed
a bit confusing to me, as it suggests to me the input to the model, rather
than the output of the model. But I've made my peace with the PyTorch
terminology.

Not quite. What you denote "x_(n,y_n)" is the predicted log-probability
that sample n belongs to its target class y_n (rather than the probability).
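To make the indexing concrete, here is a minimal plain-Python sketch of the unweighted loss with mean reduction (the default `reduction` for `NLLLoss`). The names `x`, `y`, and `nll_loss` and the example numbers are made up for illustration, and the optional class weights are left out:

```python
import math

def nll_loss(x, y):
    """Unweighted NLL with mean reduction: average of -x[n][y[n]] over samples n."""
    per_sample = [-x[n][y[n]] for n in range(len(y))]
    return sum(per_sample) / len(per_sample)

# x[n][c] is the predicted log-probability that sample n belongs to class c.
x = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],  # sample 0
    [math.log(0.1), math.log(0.3), math.log(0.6)],  # sample 1
]
y = [0, 2]  # target class index for each sample

# Only x[0][0] and x[1][2] enter the loss.
loss = nll_loss(x, y)
print(loss)
```

Only the entry at the target index contributes for each sample, which is what the subscript (n, y_n) expresses.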

So, typically, the outputs of the final `Linear` layer in the model are the
predicted raw-score logits. When passed through `LogSoftmax`, you
get the predicted log-probabilities that you then pass into `NLLLoss`.
(If you pass the logits through `Softmax`, you get probabilities, but it's
numerically more stable to work with the log-probabilities.)
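A rough plain-Python illustration of the stability point, assuming the usual log-sum-exp shift as the stable formulation (this is a sketch of the idea, not a claim about how PyTorch implements `LogSoftmax` internally; both function names are made up):

```python
import math

def naive_log_probs(logits):
    """Softmax first, then log. Tiny probabilities underflow to 0.0."""
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # math.log(0.0) raises ValueError, so report -inf explicitly instead.
    return [math.log(p) if p > 0.0 else float("-inf") for p in probs]

def log_softmax(logits):
    """Log-probabilities computed directly with the log-sum-exp shift."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]

logits = [0.0, -1000.0]
naive = naive_log_probs(logits)   # second entry is lost to underflow
stable = log_softmax(logits)      # second entry stays finite
print(naive, stable)
```

With the naive route, the log-probability of the unlikely class becomes minus infinity, so a sample with that target would produce an infinite loss; the direct route keeps it finite.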

(For convenience, PyTorch's `CrossEntropyLoss` combines `LogSoftmax`
together with `NLLLoss` so you can pass in the logits directly without
passing them through a separate `LogSoftmax` function.)
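As a quick check of that equivalence, a small sketch with made-up random logits and targets (the batch size, class count, and variable names are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 3)            # 4 samples, 3 classes, raw scores
target = torch.tensor([0, 2, 1, 2])   # integer class labels

# Two-step route: logits -> log-probabilities -> NLLLoss
log_probs = nn.LogSoftmax(dim=1)(logits)
two_step = nn.NLLLoss()(log_probs, target)

# One-step route: CrossEntropyLoss takes the logits directly
one_step = nn.CrossEntropyLoss()(logits, target)

print(two_step.item(), one_step.item())
```

The two values should agree up to floating-point round-off.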

Best.

K. Frank

Thank you very much K. Frank, for this detailed explanation!
I guess calling the network output "input" makes sense from a programmer's perspective, since it is an input to a function.

It is all clear now.
PiF