How does nn.CrossEntropyLoss aggregate the loss?

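Your reductions don’t seem to use the passed weight tensor. When a weight is passed and reduction='mean' is used, nn.CrossEntropyLoss divides the weighted sum of the per-sample losses by the sum of the weights of the target classes, not by the batch size.
Have a look at this post and let me know if this solves the issue.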
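
For reference, here is a minimal sketch of how the weighted mean reduction could be reproduced by hand. The logits, targets, and weight values are made up for illustration and are not taken from your code; the point is only that the normalizer is the sum of the per-sample weights.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy data: 5 samples, 3 classes (illustrative values only).
logits = torch.randn(5, 3)
target = torch.tensor([0, 2, 1, 2, 0])
weight = torch.tensor([1.0, 3.0, 0.5])  # per-class weights

# Built-in weighted loss with the default reduction="mean".
builtin = nn.CrossEntropyLoss(weight=weight)(logits, target)

# Manual version: unweighted per-sample losses, each scaled by the
# weight of its target class, then normalized by the sum of those weights.
per_sample = nn.CrossEntropyLoss(reduction="none")(logits, target)
sample_weight = weight[target]
manual = (per_sample * sample_weight).sum() / sample_weight.sum()

# Dividing by the batch size instead gives a different value here,
# because the per-sample weights do not sum to the batch size.
naive = (per_sample * sample_weight).mean()

matches_builtin = torch.allclose(builtin, manual)
print(builtin.item(), manual.item(), naive.item())
```

If your manual reduction averages over the batch size (as `naive` does above), it will generally not match the built-in result once a weight is passed.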