Not sure if I just don't understand NLL loss correctly, but I'm running a super simple network: a single layer with two conv nodes taking a 2x2 input of 0s and 1s, followed by a sigmoid nonlinearity. I've attached a register_backward_hook to the layer with a function that just prints the values of grad_input and grad_output.
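A minimal sketch of that setup (variable and function names are my own, not from my actual code):

```python
import torch
import torch.nn as nn

# One conv layer with two output channels ("nodes"), each seeing the
# full 2x2 binary input, followed by a sigmoid.
torch.manual_seed(0)
conv = nn.Conv2d(1, 2, kernel_size=2)

def hook(module, grad_input, grad_output):
    # Prints the gradients flowing through the conv layer on backward().
    print("grad_input: ", grad_input)
    print("grad_output:", grad_output)

conv.register_backward_hook(hook)

x = torch.tensor([[[[0., 1.], [1., 0.]]]])    # 1x1x2x2 input of 0s and 1s
outputs = torch.sigmoid(conv(x)).view(1, -1)  # shape (1, 2)
targets = torch.tensor([0])

criterion = nn.CrossEntropyLoss()             # or nn.NLLLoss()
loss = criterion(outputs, targets)
loss.backward()
```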
When I run this with CrossEntropyLoss, grad_output has values for both nodes, e.g.:
but when I run with NLLLoss, only one of the nodes has a value, which seems to be the target node, e.g.:
Is this expected behavior, or am I possibly doing something incorrect? The rest of the code is identical aside from switching the criterion before calling loss = criterion(outputs, targets), where outputs in both cases looks like:
tensor([[0.5123, 0.4775]], device='cuda:0', grad_fn=<ViewBackward>)
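For reference, the same asymmetry shows up when I backprop each loss through a standalone tensor with those values (minimal repro, no conv layer involved):

```python
import torch
import torch.nn as nn

target = torch.tensor([0])

# With NLLLoss, the gradient w.r.t. the input is nonzero only at the
# target index (index 0 here).
out = torch.tensor([[0.5123, 0.4775]], requires_grad=True)
nn.NLLLoss()(out, target).backward()
print(out.grad)  # tensor([[-1., 0.]])

# With CrossEntropyLoss, every entry of the gradient is nonzero.
out2 = torch.tensor([[0.5123, 0.4775]], requires_grad=True)
nn.CrossEntropyLoss()(out2, target).backward()
print(out2.grad)  # both entries nonzero
```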