I defined a loss function myself, but loss.backward() takes a very long time.
The code is pasted below. Training is apparently slower than an equivalent TensorFlow implementation. Is the way I defined the loss function the reason?
import torch
import torch.nn.functional as F

def func(logits, target):
    # Negative log-probabilities for every class
    log_likelihood = -F.log_softmax(logits, dim=1)
    batch = logits.shape[0]
    # Element-wise product with the (one-hot) target, averaged over the batch
    loss = torch.sum(torch.mul(log_likelihood, target)) / batch
    return loss
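For reference, when target is a one-hot encoding of the class labels, this loss should be mathematically identical to the built-in F.cross_entropy (which uses a fused log-softmax + NLL kernel). A minimal check, assuming one-hot targets (the tensor shapes and names here are my own illustration, not from the original training code):

```python
import torch
import torch.nn.functional as F

def func(logits, target):
    log_likelihood = -F.log_softmax(logits, dim=1)
    batch = logits.shape[0]
    return torch.sum(torch.mul(log_likelihood, target)) / batch

torch.manual_seed(0)
logits = torch.randn(4, 5, requires_grad=True)   # batch of 4, 5 classes
labels = torch.randint(0, 5, (4,))               # integer class labels
one_hot = F.one_hot(labels, num_classes=5).float()

custom = func(logits, one_hot)
builtin = F.cross_entropy(logits, labels)        # default reduction='mean'
print(torch.allclose(custom, builtin))           # the two losses agree
```

If the values match, any speed difference comes from the implementation rather than the math, and switching to F.cross_entropy is a reasonable thing to try.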