Great solution! I have a similar situation to the original poster. I mean, I could have written my own implementation, but your code is definitely more Pythonic and PyTorchy (if that's a real word)!
Thanks for your nice solution.
However, I just wanted to make sure: is it technically (scientifically) doing a cross entropy loss between two float tensors?
I mean, we know CE measures the difference between two distributions. Is this code doing that correctly?
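For reference, here is a minimal sketch (not the code being discussed) of what cross entropy between a predicted distribution and a soft float target looks like, assuming hypothetical logits and targets of shape `(batch, classes)`:

```python
import torch
import torch.nn.functional as F

# Hypothetical example tensors, shape (batch, classes)
logits = torch.randn(4, 10)                          # raw model outputs
target = torch.softmax(torch.randn(4, 10), dim=1)    # soft targets, each row sums to 1

log_probs = F.log_softmax(logits, dim=1)             # log q over the class dimension
loss = -(target * log_probs).sum(dim=1).mean()       # H(p, q), averaged over the batch
```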
Shouldn't log softmax be applied on dim 1, i.e. logsoftmax = nn.LogSoftmax(dim=1)?
Same question as above
Usually yes. Note that the code you are referencing was written ~2 years ago, when nn.LogSoftmax used dim=1 as the default.
This is still true in the current release, but you’ll get a warning:
UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
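Passing the dimension explicitly avoids the warning. A small sketch, assuming a hypothetical batch of logits with the class dimension at index 1:

```python
import torch
import torch.nn as nn

logsoftmax = nn.LogSoftmax(dim=1)        # explicit dim, so no deprecation warning
out = logsoftmax(torch.randn(4, 10))     # log-probabilities along the class dimension
print(out.exp().sum(dim=1))              # each row sums to ~1
```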