Training with my Focal Loss Layer doesn't work

Thank you for the detailed reply.
I checked the gradients. They come out identical, but training behaves very differently.
I have no idea why :sweat::sweat: Is there any other possible cause?

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

N = 5
C = 3

input = Variable(torch.randn(N, C), requires_grad=True)
target = Variable(torch.zeros(N).random_(0, C).long())

loss = nn.NLLLoss()(F.log_softmax(input, dim=1), target.view(N))
print(loss)
loss.backward()
print(input.grad)

Variable containing:
 1.8412
[torch.FloatTensor of size 1]

Variable containing:
 0.0994 -0.1139  0.0145
 0.1437  0.0180 -0.1618
 0.0343 -0.1474  0.1131
-0.1896  0.1606  0.0290
 0.0788 -0.1821  0.1033
[torch.FloatTensor of size 5x3]

input.grad = input.grad * 0  # zero the gradient left over from the previous backward pass
loss = MultiClassFocalLoss(gamma=0)(input, target)
print(loss)
loss.backward()
print(input.grad)

Variable containing:
 1.8412
[torch.FloatTensor of size 1]

Variable containing:
 0.0994 -0.1139  0.0145
 0.1437  0.0180 -0.1618
 0.0343 -0.1474  0.1131
-0.1896  0.1606  0.0290
 0.0788 -0.1821  0.1033
[torch.FloatTensor of size 5x3]
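
For reference, this is the kind of formulation I am comparing against: a minimal sketch of the standard multi-class focal loss (FL(p_t) = -(1 - p_t)^gamma * log(p_t), as in Lin et al. 2017), not my actual MultiClassFocalLoss layer; the class name FocalLossSketch is just illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLossSketch(nn.Module):
    # Minimal multi-class focal loss: FL(p_t) = -(1 - p_t)^gamma * log(p_t)
    def __init__(self, gamma=0.0):
        super(FocalLossSketch, self).__init__()
        self.gamma = gamma

    def forward(self, input, target):
        logp = F.log_softmax(input, dim=1)                     # (N, C) log-probabilities
        logpt = logp.gather(1, target.view(-1, 1)).squeeze(1)  # log p_t of the true class
        pt = logpt.exp()                                       # p_t of the true class
        loss = -((1.0 - pt) ** self.gamma) * logpt             # focal weight times the NLL term
        return loss.mean()                                     # mean over the batch, like NLLLoss

With gamma=0 the focal weight (1 - p_t)^gamma is 1, so this reduces exactly to nn.NLLLoss()(F.log_softmax(input, dim=1), target); identical losses and gradients, like the output above, are expected, and differences should only appear once gamma > 0.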