Not sure if I just don't understand NLL loss correctly, but I'm running a super simple network: a single layer with two conv nodes taking a 2x2 input of 0s and 1s, followed by a sigmoid nonlinearity. I've attached a register_backward_hook to the layer with a function that just prints the values of grad_input and grad_output.
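A minimal sketch of that setup (variable and function names are my own, not from my actual code):

```python
import torch
import torch.nn as nn

# One conv layer with two output channels ("nodes"), each seeing the
# full 2x2 binary input, followed by a sigmoid.
torch.manual_seed(0)
conv = nn.Conv2d(1, 2, kernel_size=2)

def hook(module, grad_input, grad_output):
    # Prints the gradients flowing through the conv layer on backward().
    print("grad_input: ", grad_input)
    print("grad_output:", grad_output)

conv.register_backward_hook(hook)

x = torch.tensor([[[[0., 1.], [1., 0.]]]])    # 1x1x2x2 input of 0s and 1s
outputs = torch.sigmoid(conv(x)).view(1, -1)  # shape (1, 2)
targets = torch.tensor([0])

criterion = nn.CrossEntropyLoss()             # or nn.NLLLoss()
loss = criterion(outputs, targets)
loss.backward()
```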
When I run this with CrossEntropyLoss, grad_output has values for both nodes, e.g.:
but when I run with NLLLoss, only one of the nodes has a value, which seems to be the target node, e.g.:
Is this expected behavior, or am I possibly doing something incorrect? The rest of the code is identical aside from switching the criterion before calling loss = criterion(outputs, targets), where outputs in both cases looks like:
tensor([[0.5123, 0.4775]], device='cuda:0', grad_fn=<ViewBackward>)
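For reference, the same asymmetry shows up when I backprop each loss through a standalone tensor with those values (minimal repro, no conv layer involved):

```python
import torch
import torch.nn as nn

target = torch.tensor([0])

# With NLLLoss, the gradient w.r.t. the input is nonzero only at the
# target index (index 0 here).
out = torch.tensor([[0.5123, 0.4775]], requires_grad=True)
nn.NLLLoss()(out, target).backward()
print(out.grad)  # tensor([[-1., 0.]])

# With CrossEntropyLoss, every entry of the gradient is nonzero.
out2 = torch.tensor([[0.5123, 0.4775]], requires_grad=True)
nn.CrossEntropyLoss()(out2, target).backward()
print(out2.grad)  # both entries nonzero
```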