# Loss average always stays at NaN during model training

Hello,

I have written my own IoU loss function, and the model always reports the training and validation loss as NaN. Note that when I switch to the cross-entropy loss function, training works fine, so maybe something is wrong with my loss function? I have also printed the loss value and the average loss value inside my loss function, and they show very small values. Can you please help me with this? Thanks

Loss Function Code

```python
import torch.nn as nn


class IntersectionOverUnion(nn.Module):
    """
    Implementation of the Soft-Dice Loss function.

    Arguments:
        num_classes (int): number of classes.
        eps (float): value of the floating point epsilon.
    """

    def __init__(self, num_classes, eps=1e-5):
        super().__init__()
        # init class fields
        self.num_classes = num_classes
        self.eps = eps

    # define the forward pass
    def forward(self, preds, targets):  # pylint: disable=unused-argument
        """
        Compute Soft-Dice Loss.

        Arguments:
            preds (torch.FloatTensor):
                tensor of predicted labels. The shape of the tensor is (B, num_classes, H, W).
            targets (torch.LongTensor):
                tensor of ground-truth labels. The shape of the tensor is (B, 1, H, W).

        Returns:
            mean_loss (float32): mean loss by class value.
        """
        loss = 0
        # iterate over all classes
        for cls in range(self.num_classes):
            # get ground truth for the current class
            target = (targets == cls).float()
            # get prediction for the current class
            pred = preds[:, cls]
            # calculate intersection
            intersection = (pred * target).sum()  # Will be zero if Truth=0 or Prediction=0
            # calculate union for the current class
            union = (pred + target).sum()  # Will be zero if both are 0
            # compute dice coefficient
            # iou = (2 * intersection + self.eps) / (pred.sum() + target.sum() + self.eps)
            iou = (intersection + self.eps) / (union + self.eps)  # We smooth our division to avoid 0/0
            print("IOU Value:", iou)
            # compute negative logarithm from the obtained
            loss = loss - iou.log()
            print("loss Value:", iou)
        # get mean loss by class value
        loss = loss / self.num_classes
        print("loss Avg Value:", iou)
        return loss
```

Model Training Result:

```
epoch: 2, test_miou: 0.090242, train_loss: nan, test_loss: nan: 25% 3/12 [1:22:27<3:35:14, 1434.99s/it]
[0/12][Train] Loss_avg: nan, Loss: nan, LR: 1e-05: 100% 262/262 [1:11:44<00:00, 16.43s/it]
Streaming output truncated to the last 5000 lines.
IOU Value: tensor(-0.0143487109, device='cuda:0', grad_fn=<DivBackward0>)
loss Value: tensor(-0.0143487109, device='cuda:0', grad_fn=<DivBackward0>)
IOU Value: tensor(0.0014381389, device='cuda:0', grad_fn=<DivBackward0>)
loss Value: tensor(0.0014381389, device='cuda:0', grad_fn=<DivBackward0>)
IOU Value: tensor(-0.0324460752, device='cuda:0', grad_fn=<DivBackward0>)
loss Value: tensor(-0.0324460752, device='cuda:0', grad_fn=<DivBackward0>)
IOU Value: tensor(0.0008592299, device='cuda:0', grad_fn=<DivBackward0>)
loss Value: tensor(0.0008592299, device='cuda:0', grad_fn=<DivBackward0>)
IOU Value: tensor(-3.3268113264e-10, device='cuda:0', grad_fn=<DivBackward0>)
loss Value: tensor(-3.3268113264e-10, device='cuda:0', grad_fn=<DivBackward0>)
IOU Value: tensor(-1.4262332426e-09, device='cuda:0', grad_fn=<DivBackward0>)
loss Value: tensor(-1.4262332426e-09, device='cuda:0', grad_fn=<DivBackward0>)
```

Could you post the code snippet, which prints:

```
Loss_avg: nan, Loss: nan, LR: 1e-05: 100%
```

I cannot find anything obviously wrong in your current code snippet and the output also seems to return valid values.
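One thing that may be worth checking in the meantime: several of the printed `IOU Value` entries are negative, and `torch.log` returns NaN for negative inputs, so `loss - iou.log()` would turn NaN as soon as one class produces a negative ratio. A negative ratio is possible if `preds` are raw logits rather than probabilities, which is only an assumption here since the model output is not shown. The sketch below is a hypothetical variant (the name `SoftIoULoss` is made up for illustration), not a confirmed fix: it applies a softmax over the class dimension, drops the singleton channel of `targets` so the product does not broadcast to an unintended shape, and computes the union as `pred.sum() + target.sum() - intersection`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftIoULoss(nn.Module):
    """Hypothetical variant of the loss above that keeps the IoU in (0, 1]."""

    def __init__(self, num_classes, eps=1e-5):
        super().__init__()
        self.num_classes = num_classes
        self.eps = eps

    def forward(self, preds, targets):
        # Assumption: preds are raw logits of shape (B, num_classes, H, W).
        probs = F.softmax(preds, dim=1)
        # Assumption: targets hold class indices with shape (B, 1, H, W).
        targets = targets.squeeze(1)
        loss = 0.0
        for cls in range(self.num_classes):
            target = (targets == cls).float()
            pred = probs[:, cls]
            intersection = (pred * target).sum()
            # Soft union: |A| + |B| - |A intersect B|, never below the intersection
            # when pred is in [0, 1] and target is in {0, 1}.
            union = pred.sum() + target.sum() - intersection
            iou = (intersection + self.eps) / (union + self.eps)
            loss = loss - iou.log()
        return loss / self.num_classes


# The log of a negative number is NaN, which is what a negative IoU would produce.
print(torch.log(torch.tensor(-0.0143487109)))  # tensor(nan)
```

With probabilities in [0, 1] the ratio stays strictly positive, so the logarithm is finite. If the model already ends in a softmax or sigmoid, the extra softmax here would be wrong and the negative values would need another explanation.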