I’m working on binary classification with a 1D convolutional network. For research purposes I have to implement my own loss function, similar to BCELoss. So I started by trying to implement my own BCE loss as an autograd Function:
```python
class DiscriminatorLoss(torch.autograd.Function):
    @staticmethod
    def forward(ctx, d_out, labels):
        loss = labels * torch.log(d_out) + (1 - labels) * torch.log(1 - d_out)
        ctx.d_out, ctx.labels = d_out, labels
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        d_out, labels = ctx.d_out, ctx.labels
        grad_input = -labels / d_out + ((1 - labels) / (1 - d_out))
        return grad_input, None
```
d_out and labels are tensors like:

```python
d_out = tensor([[0.5412, 0.5225],          labels = tensor([[0, 1],
                [0.5486, 0.5167],                           [0, 1],
                [0.5391, 0.5061], ...])                     [0, 1], ...])
```
However, this doesn’t work properly. The problem is that in the middle of the training process, the outputs of the net (d_out) collapse to strange values like:
```python
tensor([[9.9000e-08, 9.9000e-01],
        [9.9000e-08, 9.9000e-01],
        [9.9000e-08, 9.9000e-01], ...])
```
And it gets stuck there for the rest of the training.
I’ve also trained the same net with PyTorch’s own nn.BCELoss() (https://pytorch.org/docs/stable/nn.html#loss-functions). With that function the net trains fine, so I believe the problem is in my loss function. To be more exact, forward() works well (it returns the same loss values as nn.BCELoss), so the problem must be in backward().
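To narrow it down, I also checked my backward() against PyTorch’s numerical gradient with torch.autograd.gradcheck (a minimal, self-contained sketch of my Function; the shapes and the clamp range are just placeholders for my real setup):

```python
import torch

# Self-contained copy of my custom loss for the check.
class DiscriminatorLoss(torch.autograd.Function):
    @staticmethod
    def forward(ctx, d_out, labels):
        loss = labels * torch.log(d_out) + (1 - labels) * torch.log(1 - d_out)
        ctx.save_for_backward(d_out, labels)
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        d_out, labels = ctx.saved_tensors
        grad_input = -labels / d_out + ((1 - labels) / (1 - d_out))
        return grad_input, None

# gradcheck wants double precision; keep d_out strictly inside (0, 1).
d_out = torch.rand(4, 2, dtype=torch.double).clamp(0.1, 0.9).requires_grad_()
labels = torch.randint(0, 2, (4, 2), dtype=torch.double)

# Compares my analytical backward() against a finite-difference estimate
# of the forward()'s Jacobian.
ok = torch.autograd.gradcheck(DiscriminatorLoss.apply, (d_out, labels),
                              raise_exception=False)
print(ok)  # prints False, so the analytical gradient does not match
```

So gradcheck confirms that my backward() does not match the gradient of my forward().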
Can anyone help me? What am I doing wrong in the backward() function?
PS: The outputs of the net are clamped away from exactly 0 and 1 so that the cross-entropy loss does not produce -inf values.
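For reference, this is roughly how I do that clamping (a sketch; the eps value of 1e-7 is my choice and matches the 9.9000e-08 floor visible in the outputs above):

```python
import torch

def clamp_probs(d_out, eps=1e-7):
    # Keep the sigmoid outputs strictly inside (0, 1) so that
    # log(d_out) and log(1 - d_out) never evaluate to -inf.
    return torch.clamp(d_out, min=eps, max=1.0 - eps)

p = clamp_probs(torch.tensor([0.0, 0.5, 1.0]))
# Both logs are now finite for every element.
print(torch.isfinite(torch.log(p)).all(),
      torch.isfinite(torch.log(1 - p)).all())
```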