One side note: nn.NLLLoss should be used with nn.LogSoftmax, not nn.Softmax directly.
So basically raw logits passed to nn.CrossEntropyLoss, or an nn.LogSoftmax output passed to nn.NLLLoss, yield identical losses:
import torch
import torch.nn as nn

m = nn.LogSoftmax(dim=1)
criterion1 = nn.CrossEntropyLoss()  # expects raw logits
criterion2 = nn.NLLLoss()           # expects log-probabilities

x = torch.randn(1, 5)  # raw logits for 1 sample, 5 classes
y = torch.empty(1, dtype=torch.long).random_(5)  # random target class

loss1 = criterion1(x, y)
loss2 = criterion2(m(x), y)
print(loss1)
print(loss2)
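To make the equivalence explicit, here is a small self-contained sketch (batch size and seed are arbitrary choices for the demo) that also computes the loss by hand: both criteria reduce to the negative log-probability of the target class, averaged over the batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, just for reproducibility

x = torch.randn(4, 5)          # raw logits: batch of 4 samples, 5 classes
y = torch.randint(0, 5, (4,))  # random target classes

ce = nn.CrossEntropyLoss()(x, y)                    # logits in
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(x), y)      # log-probs in

# manual version: pick the log-probability of each target class and average
manual = -(x.log_softmax(dim=1)[torch.arange(4), y]).mean()

print(torch.allclose(ce, nll), torch.allclose(ce, manual))  # True True
```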