Is the target tensor converted to one-hot encoding in NLLLoss?

Hi, I’m a new learner trying to implement NLLLoss using NumPy. For an input tensor of size (minibatch, C) it’s quite simple: I can achieve it either by first converting the target tensor to a one-hot encoding matrix or by simply using indexing.
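
For reference, the indexing version in the 2D case looks something like this (just a sketch, assuming the input already holds log-probabilities, no class weights, and mean reduction):

```python
import numpy as np

def nll_loss_2d(log_probs, target):
    # log_probs: (minibatch, C) log-probabilities, target: (minibatch,) class indices.
    n = log_probs.shape[0]
    # Pick the log-probability of the true class for each sample -- no one-hot needed.
    picked = log_probs[np.arange(n), target]
    # Negative log-likelihood with mean reduction.
    return -picked.mean()
```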

However, when the input tensor is (minibatch, C, d1, d2, …, dK) in the K-dimensional case, converting the target to a one-hot encoding doesn’t seem memory efficient.

So I’d like to ask how NLLLoss achieves this. Is the target tensor first converted to a one-hot tensor, multiplied element-wise with the input tensor, and then summed over axis C (the class axis)? Or does it simply use some sort of advanced indexing to save memory? Thanks for any help!

The target tensor in nn.NLLLoss and nn.CrossEntropyLoss (which calls F.nll_loss internally) uses class indices and direct indexing; it won’t be converted to a one-hot version.
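
For example, in the K-dimensional case you can check that it matches a manual gather along the class dimension (illustrative snippet with the default mean reduction and no class weights):

```python
import torch
import torch.nn.functional as F

log_probs = F.log_softmax(torch.randn(4, 3, 8, 8), dim=1)  # (N, C, d1, d2)
target = torch.randint(0, 3, (4, 8, 8))                    # class indices, not one-hot

loss = F.nll_loss(log_probs, target)

# Gather the log-probability of the target class along dim 1 and negate.
manual = -log_probs.gather(1, target.unsqueeze(1)).mean()
print(torch.allclose(loss, manual))  # True
```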

Cool, thanks for your reply! Is this doable in plain NumPy? I can’t seem to find the source code; the Python trail ends at torch._C._nn.nll_loss2d.

Or is there any doc that details the implementation? Thank you!

The docs show the formula used, the C++ implementation can be found here, and its CUDA counterpart here.
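
And yes, a plain NumPy version of the indexing approach is straightforward; here’s a sketch for the K-dimensional case (assuming log-probabilities as input, integer class targets, no class weights, and mean reduction):

```python
import numpy as np

def nll_loss_nd(log_probs, target):
    # log_probs: (N, C, d1, ..., dK) log-probabilities
    # target:    (N, d1, ..., dK) integer class indices
    # Insert a singleton class axis into the target and gather the
    # log-probability of the true class along axis 1 -- no one-hot
    # tensor is ever materialized.
    picked = np.take_along_axis(log_probs, np.expand_dims(target, axis=1), axis=1)
    return -picked.mean()
```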