Unstable dice loss

skyunyoo · February 28, 2019, 6:29am

I am using U-Net to segment single-channel CT scan images.
The training results look strange and I can't work out why, so I am asking here.
My Dice loss values are as follows.

Epoch 243/249
----------
train Loss: 0.021468 Dice Loss: 0.150230
valid Loss: 0.022236 Dice Loss: 0.104385

Epoch 244/249
----------
train Loss: 0.021527 Dice Loss: 0.110970
valid Loss: 0.022248 Dice Loss: 0.302684

Epoch 245/249
----------
train Loss: 0.021491 Dice Loss: 0.270226
valid Loss: 0.021186 Dice Loss: 0.204749

Epoch 246/249
----------
train Loss: 0.021418 Dice Loss: 0.271920
valid Loss: 0.021641 Dice Loss: 0.595716

Epoch 247/249
----------
train Loss: 0.021435 Dice Loss: 0.463512
valid Loss: 0.021218 Dice Loss: 0.306951

Epoch 248/249
----------
train Loss: 0.021401 Dice Loss: 0.117345
valid Loss: 0.021995 Dice Loss: 0.174188

Epoch 249/249
----------
train Loss: 0.021491 Dice Loss: 0.000062
valid Loss: 0.021817 Dice Loss: 0.369243

The functions and training loop I used are as follows.

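import torch
import torch.nn as nn

def dice_loss(inputs, target):
    # Per-image overlap, summed over the last two dimensions (assumed to be H and W).
    # Note: this returns the Dice coefficient itself (1.0 = perfect overlap), not 1 - Dice.
    inter = (inputs * target).sum(-1).sum(-1)
    # Sum of both areas (the Dice denominator, not a true set union).
    union = inputs.sum(-1).sum(-1) + target.sum(-1).sum(-1)
    result = (2 * inter / union).mean()
    return result

def normalize(x):
    # Scale 8-bit pixel values from [0, 255] to [0, 1].
    return x / 255.0

criterion = nn.BCELoss()

def fit(epoch, model, data_loader, phase='train', volatile=False):
    # optimizer, exp_lr_scheduler and is_cuda are defined elsewhere in the script.
    # `volatile` is kept for compatibility with existing callers but is no longer used.
    if phase == 'train':
        exp_lr_scheduler.step()
        model.train()
    if phase == 'valid':
        model.eval()
    running_loss = 0.0

    # Track gradients only while training; this replaces the deprecated
    # Variable(..., volatile) wrapper.
    with torch.set_grad_enabled(phase == 'train'):
        for batch_idx, (data, target) in enumerate(data_loader):
            inputs, target = data.cpu(), target.cpu()
            if is_cuda:
                inputs, target = data.cuda(), target.cuda()
            inputs, target = normalize(inputs), normalize(target)
            if phase == 'train':
                optimizer.zero_grad()

            output = model(inputs)
            output = torch.sigmoid(output)
            loss = criterion(output, target)
            running_loss += loss.item()
            if phase == 'train':
                loss.backward()
                optimizer.step()

            # Dice for the current batch only; each batch overwrites the previous value.
            dice = dice_loss(output, target)

    # Sum of per-batch mean losses divided by the number of samples.
    loss = running_loss / len(data_loader.dataset)

    print('{} Loss: {:.6f} Dice Loss: {:.6f}'.format(
                phase, loss, dice))
    return loss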

The BCELoss value used as the criterion is stable, but I don’t know why the Dice loss value jumps up and down from epoch to epoch, sometimes dropping to almost zero.
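
One thing that may be worth checking: in the loop above, `dice` is reassigned on every batch, so the printed number reflects only the last batch of the epoch, whereas the BCE value is accumulated over all batches. Below is a minimal sketch of averaging a per-batch metric over the whole epoch. It is written in plain Python so it runs without PyTorch; `RunningMean` and the sample values are hypothetical and not part of my original code.

```python
class RunningMean:
    """Accumulates a sample-weighted mean of per-batch values."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value, n=1):
        # `value` is the mean of the metric over a batch of `n` samples.
        self.total += float(value) * n
        self.count += n

    @property
    def mean(self):
        # NaN signals that nothing has been accumulated yet.
        return self.total / self.count if self.count else float("nan")


# Hypothetical (per-batch Dice, batch size) pairs for one epoch.
epoch_dice = RunningMean()
for batch_dice, batch_size in [(0.9, 4), (0.1, 4), (0.5, 2)]:
    epoch_dice.update(batch_dice, batch_size)

print("epoch Dice: {:.6f}".format(epoch_dice.mean))
```

Inside `fit`, the equivalent call would presumably be `epoch_dice.update(dice_loss(output, target).item(), inputs.size(0))` for each batch, with `epoch_dice.mean` printed at the end instead of `dice`.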

I suspected either the data distribution (the goal is to segment the liver, but the dataset also contains slices with no liver at all, i.e. an all-black mask with no labelled region) or a normalization problem, but I could not find a clear answer.
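
To make the empty-mask suspicion concrete, here is a small worked example of the same Dice arithmetic on tiny flattened "images". Plain Python lists stand in for tensors so the numbers are easy to follow. The `smooth` term is an assumption on my part (a commonly used stabiliser, not something in my original function), and all values are made up for illustration.

```python
def dice_score(pred, mask, smooth=0.0):
    """Dice coefficient for one flattened image.

    `smooth` is an assumed stabilising constant added to the numerator
    and the denominator; smooth=0.0 reproduces the plain 2*inter/total.
    """
    inter = sum(p * m for p, m in zip(pred, mask))
    total = sum(pred) + sum(mask)
    return (2.0 * inter + smooth) / (total + smooth)


# Slice that contains liver: the prediction matches the mask well.
liver_mask = [0.0, 1.0, 1.0, 0.0]
liver_pred = [0.1, 0.9, 0.9, 0.1]

# Slice without liver: the mask is all zeros and the prediction is
# (correctly) close to zero everywhere.
empty_mask = [0.0, 0.0, 0.0, 0.0]
empty_pred = [0.01, 0.01, 0.01, 0.01]

with_liver = dice_score(liver_pred, liver_mask)            # close to 0.9
without_liver = dice_score(empty_pred, empty_mask)         # exactly 0.0
without_liver_smoothed = dice_score(empty_pred, empty_mask, smooth=1.0)

print(with_liver, without_liver, without_liver_smoothed)
```

Without smoothing, a correctly predicted empty slice scores 0, because the intersection is 0 while the sigmoid outputs keep the denominator slightly above 0. If the last batch of an epoch happens to contain mostly such slices, that might explain values like 0.000062 even though the BCE loss barely moves.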