The issue is still open. What's the optimal way of implementing cross-entropy loss?
Doing it like this means computing the loss twice:
import torch
import torch.nn.functional as F

def loss(y, targets):
    # Manual per-sample cross entropy: softmax, then negative log of the
    # probability assigned to each target class.
    probs = F.softmax(y, dim=1)
    per_sample = [-torch.log(probs[i][targets[i]]) for i in range(y.size(0))]
    # F.cross_entropy recomputes the same quantity internally.
    return F.cross_entropy(y, targets), per_sample
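
One way to avoid the double computation is to take the log-probabilities once with F.log_softmax and reuse them for both the per-sample values and the aggregate loss. F.nll_loss on log-probabilities is equivalent to F.cross_entropy on raw logits, and log_softmax is numerically stabler than taking softmax first and then log. A minimal sketch of this approach (the function and variable names here are just illustrative):

import torch
import torch.nn.functional as F

def loss(y, targets):
    # Compute log-probabilities once, in a numerically stable way.
    log_probs = F.log_softmax(y, dim=1)
    # Per-sample losses: pick out the log-probability of each target class.
    per_sample = -log_probs[torch.arange(y.size(0)), targets]
    # Equivalent to F.cross_entropy(y, targets), but reuses log_probs.
    return F.nll_loss(log_probs, targets), per_sample

For example, with y = torch.randn(4, 10) and targets = torch.randint(0, 10, (4,)), per_sample.mean() should match the first return value.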