I’m working on binary classification with a 1D convolutional network. For research purposes I have to implement my own loss function, similar to BCELoss. So I started by trying to implement my own BCE loss as an autograd Function:
```python
class DiscriminatorLoss(torch.autograd.Function):
    @staticmethod
    def forward(ctx, d_out, labels):
        loss = labels * torch.log(d_out) + (1 - labels) * torch.log(1 - d_out)
        ctx.d_out, ctx.labels = d_out, labels
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        d_out, labels = ctx.d_out, ctx.labels
        grad_input = -labels / d_out + ((1 - labels) / (1 - d_out))
        return grad_input, None
```
d_out and labels are tensors like:

```python
d_out = tensor([[0.5412, 0.5225],          labels = tensor([[0, 1],
                [0.5486, 0.5167],                           [0, 1],
                [0.5391, 0.5061], ...])                     [0, 1], ...])
```
However, this doesn’t work properly. The problem is that in the middle of the training process, the outputs of the net (d_out) collapse to strange values like:
```python
tensor([[9.9000e-08, 9.9000e-01],
        [9.9000e-08, 9.9000e-01],
        [9.9000e-08, 9.9000e-01], ...])
```
And it gets stuck there for the rest of the training.
I’ve also trained the same net with PyTorch’s own nn.BCELoss() (https://pytorch.org/docs/stable/nn.html#loss-functions). With that function the net trains fine, so I believe the problem is in my loss function. To be more exact, forward() works well (it returns the same loss values as nn.BCELoss), so the problem must be in backward().
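To narrow it down, I also checked my backward() against PyTorch’s numerical gradient with torch.autograd.gradcheck (a minimal, self-contained sketch of my Function; the shapes and the clamp range are just placeholders for my real setup):

```python
import torch

# Self-contained copy of my custom loss for the check.
class DiscriminatorLoss(torch.autograd.Function):
    @staticmethod
    def forward(ctx, d_out, labels):
        loss = labels * torch.log(d_out) + (1 - labels) * torch.log(1 - d_out)
        ctx.save_for_backward(d_out, labels)
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        d_out, labels = ctx.saved_tensors
        grad_input = -labels / d_out + ((1 - labels) / (1 - d_out))
        return grad_input, None

# gradcheck wants double precision; keep d_out strictly inside (0, 1).
d_out = torch.rand(4, 2, dtype=torch.double).clamp(0.1, 0.9).requires_grad_()
labels = torch.randint(0, 2, (4, 2), dtype=torch.double)

# Compares my analytical backward() against a finite-difference estimate
# of the forward()'s Jacobian.
ok = torch.autograd.gradcheck(DiscriminatorLoss.apply, (d_out, labels),
                              raise_exception=False)
print(ok)  # prints False, so the analytical gradient does not match
```

So gradcheck confirms that my backward() does not match the gradient of my forward().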
Can anyone help me? What am I doing wrong in the backward() function?
PS: The outputs of the net are clamped away from exactly 0 and 1 so that the cross-entropy loss does not produce -inf values.
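For reference, this is roughly how I do that clamping (a sketch; the eps value of 1e-7 is my choice and matches the 9.9000e-08 floor visible in the outputs above):

```python
import torch

def clamp_probs(d_out, eps=1e-7):
    # Keep the sigmoid outputs strictly inside (0, 1) so that
    # log(d_out) and log(1 - d_out) never evaluate to -inf.
    return torch.clamp(d_out, min=eps, max=1.0 - eps)

p = clamp_probs(torch.tensor([0.0, 0.5, 1.0]))
# Both logs are now finite for every element.
print(torch.isfinite(torch.log(p)).all(),
      torch.isfinite(torch.log(1 - p)).all())
```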