I can train with my focal loss layer, but the behavior seems strange. For example, when γ=0 focal loss should be identical to cross-entropy loss, yet the loss curves are very different (when training with the focal loss layer, the gap between training and validation accuracy is very large).
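(For reference, here is a minimal sketch of the standard formulation I mean, assuming FL(p_t) = −(1 − p_t)^γ log(p_t); my actual layer may differ. With γ=0 the (1 − p_t)^γ factor is 1, so it reduces to cross-entropy.)

import torch
import torch.nn.functional as F

def focal_loss(input, target, gamma=2.0):
    # log(p_t) for the true class of each sample
    logpt = F.log_softmax(input, dim=1).gather(1, target.unsqueeze(1)).squeeze(1)
    pt = logpt.exp()
    # -(1 - p_t)^gamma * log(p_t); gamma=0 gives plain cross-entropy
    return (-((1 - pt) ** gamma) * logpt).mean()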
I have confirmed that the two loss layers produce the same output given the same input, but I can't find the reason. Please help me…
One thing you can do is generate a random input and a random target and compute gradients for the input:
import torch
import torch.nn.functional as F
from torch.autograd import Variable
N = 5  # minibatch size
C = 3  # number of classes
input = Variable(torch.randn(N, C), requires_grad=True)
target = Variable(torch.zeros(N).random_(0, C).long())
loss = F.cross_entropy(input, target)
loss.backward()
print(input.grad)  # gradient of the loss w.r.t. the input
And repeat, replacing `F.cross_entropy` with your loss.
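Continuing the snippet above, for example (just a sketch; `focal_loss` is a hypothetical stand-in for however your layer is called):

# Compare the input gradient from your loss against the one above.
# `focal_loss` is a hypothetical stand-in for your layer's forward call.
input2 = Variable(input.data.clone(), requires_grad=True)
loss2 = focal_loss(input2, target, gamma=0)
loss2.backward()
# Should print a value near zero if the gradients really match
print((input.grad - input2.grad).abs().max())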
Thank you for the detailed reply.
I checked the gradients. They are the same, but training is still very different.
I have no idea. Is there any other possible cause?