I am using U-Net to segment single-channel CT scan images.
The training results look strange and I can't work out why, so I am asking here.
My Dice loss values are as follows.
Epoch 243/249
----------
train Loss: 0.021468 Dice Loss: 0.150230
valid Loss: 0.022236 Dice Loss: 0.104385
Epoch 244/249
----------
train Loss: 0.021527 Dice Loss: 0.110970
valid Loss: 0.022248 Dice Loss: 0.302684
Epoch 245/249
----------
train Loss: 0.021491 Dice Loss: 0.270226
valid Loss: 0.021186 Dice Loss: 0.204749
Epoch 246/249
----------
train Loss: 0.021418 Dice Loss: 0.271920
valid Loss: 0.021641 Dice Loss: 0.595716
Epoch 247/249
----------
train Loss: 0.021435 Dice Loss: 0.463512
valid Loss: 0.021218 Dice Loss: 0.306951
Epoch 248/249
----------
train Loss: 0.021401 Dice Loss: 0.117345
valid Loss: 0.021995 Dice Loss: 0.174188
Epoch 249/249
----------
train Loss: 0.021491 Dice Loss: 0.000062
valid Loss: 0.021817 Dice Loss: 0.369243
The functions and training loop I used are as follows.
def dice_loss(inputs, target):
    inter = (inputs * target).sum(-1).sum(-1)
    union = inputs.sum(-1).sum(-1) + target.sum(-1).sum(-1)
    result = (2 * inter / union).mean()
    return result

def normalize(x):
    return x / 255.0

criterion = nn.BCELoss()

def fit(epoch, model, data_loader, phase='train', volatile=False):
    if phase == 'train':
        exp_lr_scheduler.step()
        model.train()
    if phase == 'valid':
        model.eval()
        volatile = True
    running_loss = 0.0
    for batch_idx, (data, target) in enumerate(data_loader):
        inputs, target = data.cpu(), target.cpu()
        if is_cuda:
            inputs, target = data.cuda(), target.cuda()
        inputs, target = Variable(inputs, volatile), Variable(target)
        inputs, target = normalize(inputs), normalize(target)
        if phase == 'train':
            optimizer.zero_grad()
        output = model(inputs)
        output = torch.sigmoid(output)
        loss = criterion(output, target)
        running_loss += loss.data.item()
        if phase == 'train':
            loss.backward()
            optimizer.step()
        dice = dice_loss(output, target)
    loss = running_loss / len(data_loader.dataset)
    print('{} Loss: {:.6f} Dice Loss: {:.6f}'.format(
        phase, loss, dice))
    return loss

The BCELoss value used as the criterion is stable, but I don't know why the Dice loss value swings up and down between epochs and sometimes drops to almost zero.
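One thing that may matter here: in `fit`, `running_loss` is accumulated over every batch, while the `dice` that gets printed is whatever was computed from the last batch of the epoch. A minimal sketch of averaging the Dice value over all samples instead is shown below. `dice_coefficient` and `DiceMeter` are hypothetical names that are not part of the code above, and the tensors are toy data chosen only to make the difference visible.

```python
import torch

def dice_coefficient(inputs, target):
    # Same formula as dice_loss above, but returns one value per sample
    # instead of the batch mean.
    inter = (inputs * target).sum(-1).sum(-1)
    union = inputs.sum(-1).sum(-1) + target.sum(-1).sum(-1)
    return 2 * inter / union

class DiceMeter:
    """Hypothetical helper: accumulates per-sample Dice across batches."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, output, target):
        per_sample = dice_coefficient(output, target)
        self.total += per_sample.sum().item()
        self.count += per_sample.numel()

    def average(self):
        return self.total / max(self.count, 1)

# Toy data: the first batch overlaps perfectly, the second not at all.
out1, tgt1 = torch.ones(2, 1, 4, 4), torch.ones(2, 1, 4, 4)
out2, tgt2 = torch.zeros(2, 1, 4, 4), torch.ones(2, 1, 4, 4)

meter = DiceMeter()
for output, target in [(out1, tgt1), (out2, tgt2)]:
    meter.update(output, target)

epoch_average = meter.average()                               # mean over all 4 samples
last_batch_only = dice_coefficient(out2, tgt2).mean().item()  # value of the final batch
print(epoch_average, last_batch_only)
```

In this toy case the epoch average and the last-batch value differ a lot, which is the kind of gap that could make a last-batch number jump around from epoch to epoch when batches are shuffled.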
I suspected either the data distribution or the normalization. The goal is to segment the liver, but the dataset also includes slices that contain no liver, so some masks are nothing but black background. I could not find a clear answer, though.
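A small check of the empty-mask idea, as a sketch only: `dice_plain` repeats the formula from `dice_loss` above, while `dice_smooth` adds a smoothing constant, which is a commonly used variant and not something in my training code. The function names, the `smooth=1.0` default and the 4x4 tensors are all assumptions made for illustration.

```python
import torch

def dice_plain(inputs, target):
    # Same computation as dice_loss in the post.
    inter = (inputs * target).sum(-1).sum(-1)
    union = inputs.sum(-1).sum(-1) + target.sum(-1).sum(-1)
    return (2 * inter / union).mean()

def dice_smooth(inputs, target, smooth=1.0):
    # Assumed variant: a smoothing constant in numerator and denominator
    # keeps the ratio defined when the mask is empty.
    inter = (inputs * target).sum(-1).sum(-1)
    union = inputs.sum(-1).sum(-1) + target.sum(-1).sum(-1)
    return ((2 * inter + smooth) / (union + smooth)).mean()

# A slice with no liver: the mask is all zeros.
empty_target = torch.zeros(1, 1, 4, 4)
# A sigmoid output that is small but not exactly zero everywhere.
faint_output = torch.full((1, 1, 4, 4), 1e-3)
# An output that is exactly zero everywhere.
zero_output = torch.zeros(1, 1, 4, 4)

plain_empty = dice_plain(faint_output, empty_target)    # intersection is 0, so the ratio is 0
plain_nan = dice_plain(zero_output, empty_target)       # 0 / 0
smooth_empty = dice_smooth(faint_output, empty_target)  # close to 1 for a correct "no liver" prediction

print(plain_empty.item(), plain_nan.item(), smooth_empty.item())
```

With the plain formula, a correctly predicted empty slice scores 0 (or is undefined when the output is exactly zero), so a batch made up mostly of liver-free slices would pull the reported value toward zero even if the prediction is right.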