loss.backward() doesn’t update the model when I’m using my custom loss function

Hi everyone, I came across some problems with gradient updates when training my network. I’m using U-Net as my model, and I wrote a custom Dice loss function:

import torch
import torch.nn as nn


class DiceCoffLoss(nn.Module):

    def __init__(self, label_of_interest=1):
        super(DiceCoffLoss, self).__init__()
        self.labelInterest = label_of_interest

    def forward(self, prediction, segmentation):
        """Inputs are 2D arrays."""
        segmentation = torch.autograd.Variable(segmentation)
        prediction = torch.autograd.Variable(prediction)

        if prediction.shape != segmentation.shape:
            raise ValueError("Shape mismatch between given arrays. prediction %s vs segmentation %s"
                             % (str(prediction.shape), str(segmentation.shape)))

        n_organ_seg = (segmentation == self.labelInterest).sum()
        n_organ_pred = (prediction == self.labelInterest).sum()
        denominator = n_organ_pred + n_organ_seg
        if denominator == 0:
            return torch.tensor(1.0, requires_grad=True)  # mask or prediction contains none of the label of interest

        iflat = prediction.contiguous().view(-1)
        tflat = segmentation.contiguous().view(-1)
        organ_intersection = (iflat == self.labelInterest) * (tflat == self.labelInterest)  # elementwise AND of the two label masks
        n_organ_intersection = organ_intersection.sum()
        dice = 2.0 * n_organ_intersection / denominator
        return torch.tensor(1 - dice, requires_grad=True)

I used my model with other loss functions like nn.functional.cross_entropy and everything was OK, but when I use my own loss function, loss.backward() does not update the model and the model can’t train anymore.

I am still working on this and have failed to figure out why the gradients don’t update as usual. Any insight would be sincerely appreciated!

In the first lines of forward you are re-wrapping your predictions and targets in new Variables.
This will detach the tensors from the computation graph, so that no information will flow back to your model.
Since Variables have been deprecated since 0.4.0, you don’t need to wrap tensors anymore.
However, even in older versions these lines would be problematic.
Just try to remove them. Or do you get an error if you use prediction and segmentation directly to calculate your dice loss?
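
As a rough way to check this, the loss tensor should still have a grad_fn right before you call backward(); if it has been detached or re-created, grad_fn will be None and nothing can flow back. A minimal sketch with a stand-in model (the names here are placeholders, not your actual U-Net code):

import torch
import torch.nn as nn

model = nn.Linear(10, 1)                    # stand-in for the U-Net
prediction = model(torch.randn(2, 10))

loss_ok = prediction.sum()                  # built from differentiable ops
print(loss_ok.grad_fn)                      # <SumBackward0 ...> -> gradients can reach the model

loss_detached = torch.tensor(loss_ok.item(), requires_grad=True)  # fresh leaf tensor, history is gone
print(loss_detached.grad_fn)                # None -> backward() will not update the model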

Thanks for your reply. Yes, you are right, those two lines were useless. I changed my code as below:

class DiceCoffLoss(nn.Module):

    def __init__(self, label_of_interest=1):
        super(DiceCoffLoss, self).__init__()
        self.labelInterest = label_of_interest

    def forward(self, prediction, segmentation):
        """Inputs are 2D arrays."""
        if prediction.shape != segmentation.shape:
            raise ValueError("Shape mismatch between given arrays. prediction %s vs segmentation %s"
                             % (str(prediction.shape), str(segmentation.shape)))

        n_organ_seg = (segmentation == self.labelInterest).sum()
        n_organ_pred = (prediction == self.labelInterest).sum()
        denominator = n_organ_pred + n_organ_seg
        if denominator == 0:
            return torch.tensor(1.0, requires_grad=True)  # mask or prediction contains none of the label of interest

        iflat = prediction.contiguous().view(-1)
        tflat = segmentation.contiguous().view(-1)
        organ_intersection = (iflat == self.labelInterest) * (tflat == self.labelInterest)  # elementwise AND of the two label masks
        n_organ_intersection = organ_intersection.sum()
        dice = 2.0 * n_organ_intersection / denominator
        return torch.tensor(1 - dice, requires_grad=True)

I’ve restarted training; nearly 30 epochs have completed so far, but the loss has not changed at all, not even by an epsilon.
Is there anything else wrong?

In the last line you are re-creating a tensor, thus detaching it from the computation graph.
I’m not sure what self.labelInterest is, but it seems to be some kind of filter to select the current class.
Could you try to remove the creation of the new tensor and just return (1 - dice)?
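
For illustration, here is a minimal sketch with a toy differentiable expression (not your actual dice computation) showing the difference between returning the result directly and re-creating it with torch.tensor:

import torch

x = torch.randn(3, requires_grad=True)
dice = (x * x).sum() / 3                          # toy value that stays in the graph

good = 1 - dice                                   # keeps the autograd history
good.backward()
print(x.grad)                                     # gradients arrive at x

x.grad = None
bad = torch.tensor(1 - dice, requires_grad=True)  # fresh leaf tensor, history is lost
bad.backward()
print(x.grad)                                     # None, nothing reaches x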

I apologize for the delay.
About self.labelInterest: yes, you guessed right. I removed the last re-wrapping, but nothing changed.
I changed my code to:

import torch.nn.functional as F
import torch.nn as nn


class SoftDiceLoss(nn.Module):
    def __init__(self, label_of_interest=1, weight=None, size_average=True):
        super(SoftDiceLoss, self).__init__()
        self.labelInterest = label_of_interest

    def forward(self, logits, targets):
        smooth = 1
        num = targets.size(0)

        m1 = logits.view(num, -1)
        m2 = targets.view(num, -1)
        intersection = (m1 == self.labelInterest) * (m2 == self.labelInterest)

        score = 2. * (intersection.sum() + smooth).float() / ((m1 == self.labelInterest).sum() + (m2 == self.labelInterest).sum() + smooth).float()
        score = 1 - score
        score.requires_grad = True  # I get an error without this line of code; it is necessary

        return score

But nothing changed! I even changed the learning rate many times, but it did not have any effect.
What do you think about overriding the backward function? Is it a good idea for this problem?