Training with my Focal Loss Layer doesn't work

I’m a beginner with PyTorch :fire: This is my first question.

I want to use focal loss in my research.
I’m struggling to apply focal loss to a multi-class segmentation problem. I implemented the loss function, but it doesn’t work. The implementation is here: https://github.com/doiken23/pytorch_toolbox/blob/master/focal_loss_multiclass.py

I can train with my focal loss layer, but its behavior seems strange. For example, with γ=0 focal loss should be identical to cross-entropy loss, but the loss curves are very different (when training with the focal loss layer, the gap between training and validation accuracy is very large).
I confirmed that the two loss layers produce the same output for the same input, but I can’t find the reason. Please help me…
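For context, the loss I’m trying to implement is the standard multi-class focal loss, FL(p_t) = -(1 - p_t)^γ · log(p_t). A minimal sketch of the idea (not my exact repo code, just to show what γ=0 should reduce to):

import torch
import torch.nn.functional as F
from torch.autograd import Variable

def focal_loss(input, target, gamma=2.0):
    # input: (N, C) logits, target: (N,) class indices.
    # FL(p_t) = -(1 - p_t)**gamma * log(p_t);
    # with gamma=0 this reduces exactly to cross-entropy.
    logpt = F.log_softmax(input, dim=1).gather(1, target.view(-1, 1)).view(-1)
    pt = logpt.exp()
    return (-(1 - pt) ** gamma * logpt).mean()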

Are the gradients the same?

Thank you for your advice.

No, I didn’t check. I will try, but how can I do that? Would you tell me how to check the gradients?

Best regards.

One thing you can do is generate a random input and a random target and compute gradients for the input:

import torch
import torch.nn.functional as F
from torch.autograd import Variable

N = 5  # minibatch size
C = 3  # number of classes
input = Variable(torch.randn(N, C), requires_grad=True)
target = Variable(torch.zeros(N).random_(0, C).long())
loss = F.cross_entropy(input, target)
loss.backward()
print(input.grad)

And repeat, replacing `F.cross_entropy` with your loss.
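If you want a more rigorous check than comparing printed values, `torch.autograd.gradcheck` compares analytic gradients against numerical ones. It needs double-precision input; the lambda here is just a convenience wrapper:

import torch
import torch.nn.functional as F
from torch.autograd import Variable, gradcheck

N, C = 5, 3
# gradcheck requires double precision for its numerical tolerances
input = Variable(torch.randn(N, C).double(), requires_grad=True)
target = Variable(torch.zeros(N).random_(0, C).long())

# swap F.cross_entropy for your focal loss here
print(gradcheck(lambda x: F.cross_entropy(x, target), (input,)))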


Thank you for the detailed reply.
I checked the gradients. The gradients are the same, but training is still very different.
I have no idea :sweat::sweat: Is there any other possible cause?

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

N = 5  # minibatch size
C = 3  # number of classes

input = Variable(torch.randn(N, C), requires_grad=True)
target = Variable(torch.zeros(N).random_(0, C).long())

# reference: cross-entropy as log-softmax + NLL
loss = nn.NLLLoss()(F.log_softmax(input, dim=1), target.view(N))
print(loss)
loss.backward()
print(input.grad)

Variable containing:
 1.8412
[torch.FloatTensor of size 1]

Variable containing:
 0.0994 -0.1139  0.0145
 0.1437  0.0180 -0.1618
 0.0343 -0.1474  0.1131
-0.1896  0.1606  0.0290
 0.0788 -0.1821  0.1033
[torch.FloatTensor of size 5x3]

input.grad.data.zero_()  # clear the accumulated gradient before the second backward
loss = MultiClassFocalLoss(gamma=0)(input, target)
print(loss)
loss.backward()
print(input.grad)

Variable containing:
 1.8412
[torch.FloatTensor of size 1]

Variable containing:
 0.0994 -0.1139  0.0145
 0.1437  0.0180 -0.1618
 0.0343 -0.1474  0.1131
-0.1896  0.1606  0.0290
 0.0788 -0.1821  0.1033
[torch.FloatTensor of size 5x3]
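One more check I plan to try: both snippets above use a 2D (N, C) input, but my real model outputs 4D segmentation maps. It may be worth repeating the comparison with a segmentation-shaped input, to make sure both losses reduce over the spatial dimensions the same way (shapes below are just placeholders):

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

N, C, H, W = 2, 3, 4, 4  # placeholder segmentation shapes
input = Variable(torch.randn(N, C, H, W), requires_grad=True)
target = Variable(torch.zeros(N, H, W).random_(0, C).long())

# NLLLoss averages over all N*H*W pixels by default, so the focal loss
# should be compared against the same reduction.
# (Older PyTorch versions use nn.NLLLoss2d for 4D input.)
loss = nn.NLLLoss()(F.log_softmax(input, dim=1), target)
print(loss)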