In NLLLoss for multi-dimensional inputs, I see that the log-probabilities tensor has to be arranged as (N, C, d1, d2, …)
and the target as (N, d1, d2, …). Why is this required?
Why can't NLLLoss work out the loss itself from an input of shape (N, d1, d2, …, dk, C) and a target of shape (N, d1, d2, …, dk, 1) (after unsqueezing the last dimension), given that only one dimension differs?
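
To make the two layouts concrete, here is a minimal sketch (the sizes N, C, d1, d2 and all variable names are my own, chosen for illustration). It assumes a recent PyTorch where `Tensor.movedim` is available, and compares NLLLoss on the class-second layout with a manual gather along the last dimension of the class-last layout.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
N, C, d1, d2 = 2, 5, 3, 4  # illustrative sizes, not from any real model

# Class-last log-probabilities (N, d1, d2, C) and class-index targets (N, d1, d2).
log_probs_last = F.log_softmax(torch.randn(N, d1, d2, C), dim=-1)
target = torch.randint(0, C, (N, d1, d2))

# Option 1: move the class dimension to position 1, the layout NLLLoss expects.
loss_permuted = F.nll_loss(log_probs_last.movedim(-1, 1), target)

# Option 2: the layout from the question. Unsqueeze the target to
# (N, d1, d2, 1), pick the target class along the last dimension,
# then negate and average (matching the default reduction="mean").
picked = log_probs_last.gather(-1, target.unsqueeze(-1)).squeeze(-1)
loss_gathered = -picked.mean()

same = torch.allclose(loss_permuted, loss_gathered)
print(loss_permuted.item(), loss_gathered.item(), same)
```

So both layouts carry the same information, and a class-last tensor can be fed to NLLLoss after a single `movedim` (or `permute`) call.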