How to compute loss between ground truth bounding box and bounding box computed from segmentation mask

I am trying to compute the loss between the ground-truth bounding box and a bounding box generated from the predicted segmentation mask. My function for extracting the bounding boxes from the masks is as follows:

import torch
import torch.nn as nn
import torchvision.ops as ops


def get_bounding_boxes_from_masks(segmentation_masks, device):
    bbox_tensor_list = []
    for i in range(segmentation_masks.shape[0]):
        if torch.sum(segmentation_masks[i]) == 0:
            # Empty mask: fall back to a fixed dummy box (a constant tensor)
            bounding_boxes = torch.tensor([[0.0, 0.0, 1.0, 1.0]], dtype=torch.float32, device=device)
        else:
            # masks_to_boxes takes (N, H, W) masks and returns (N, 4) boxes as (x1, y1, x2, y2)
            bounding_boxes = ops.masks_to_boxes(segmentation_masks[i])
        bbox_tensor_list.append(bounding_boxes)
    concatenated_bbox = torch.cat(bbox_tensor_list, dim=0)
    return concatenated_bbox

Once I have the bounding boxes, I compute the loss between the ground-truth bounding boxes (bboxes) and the bounding boxes from the segmentation masks (mask_bboxes). Each segmentation mask contains a single object only.

from torch.autograd import Variable
from tqdm import tqdm

mse_loss = nn.MSELoss()
epoch_loss = 0.0
for step, (image, gt, bboxes) in enumerate(tqdm(train_dataloader)):
    bboxes = Variable(bboxes, requires_grad=False)

    # mask_predictions come from the segmentation model
    mask_predictions = model(image, bboxes)

    mask_predictions = torch.sigmoid(mask_predictions)
    binary_masks = mask_predictions > 0.5
    mask_bboxes = get_bounding_boxes_from_masks(binary_masks, device=device)
    mask_bboxes = Variable(mask_bboxes, requires_grad=True)
    loss = mse_loss(mask_bboxes.to(device), bboxes.to(device))
    print(loss.item(), loss.grad)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    epoch_loss += loss.item()

I am getting loss.grad as “None”, and the loss stays exactly constant across epochs (it does not change even in the decimal places). I don't understand what is happening here. Can anyone help?

Let me know if any other information is required.

Thanks

Hello Aayushktyagi,
I suspect your code is breaking the autograd graph. In the first branch of the if-statement you create a new tensor that is detached from the segmentation mask and therefore from the gradient tracking that is required to update your model’s parameters.
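To illustrate the point, here is a small sketch that checks which tensors are still attached to the autograd graph. The shapes are made up and `logits` is only a stand-in for the model output. Note that the thresholding step also yields a detached tensor, not just the fallback box:

```python
import torch

# Hypothetical stand-in for the raw output of the segmentation model
logits = torch.randn(2, 8, 8, requires_grad=True)

probs = torch.sigmoid(logits)   # differentiable, still attached to the graph
binary = probs > 0.5            # comparison yields a bool tensor, detached
fallback = torch.tensor([[0.0, 0.0, 1.0, 1.0]])  # fresh constant, detached

print(probs.requires_grad, probs.grad_fn is not None)  # True True
print(binary.requires_grad, binary.grad_fn)            # False None
print(fallback.requires_grad, fallback.grad_fn)        # False None
```

Anything computed only from `binary` or `fallback` cannot pass gradients back to the model.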
To be sure, use a debugger and follow the gradient flow.

Best,
Luis

Hey @nwn, I tried removing the if-statement, but I am still getting the same issue, i.e. the loss is constant and loss.grad is None.
Is there anything else that can be tried?

Thanks
Aayush
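One thing that could be tried, sketched here under stated assumptions: skip the hard threshold and box extraction, and instead compare the axis projections of the soft mask with those of the ground-truth box. This is a different loss from the MSE on box coordinates above, not a fix of it. `boxes_to_masks` and `projection_loss` are hypothetical helper names, and the boxes are assumed to be in (x1, y1, x2, y2) pixel coordinates with inclusive bounds.

```python
import torch
import torch.nn.functional as F


def boxes_to_masks(boxes, height, width):
    """Rasterize (N, 4) boxes (x1, y1, x2, y2) into (N, H, W) float masks."""
    ys = torch.arange(height, device=boxes.device).view(1, height, 1)
    xs = torch.arange(width, device=boxes.device).view(1, 1, width)
    x1, y1, x2, y2 = (boxes[:, i].view(-1, 1, 1) for i in range(4))
    inside = (xs >= x1) & (xs <= x2) & (ys >= y1) & (ys <= y2)
    return inside.float()


def projection_loss(probs, boxes):
    """probs: (N, H, W) mask probabilities in [0, 1]; boxes: (N, 4)."""
    _, height, width = probs.shape
    box_masks = boxes_to_masks(boxes, height, width)
    # Max over rows gives a profile along x; max over columns a profile along y.
    loss_x = F.binary_cross_entropy(probs.amax(dim=1), box_masks.amax(dim=1))
    loss_y = F.binary_cross_entropy(probs.amax(dim=2), box_masks.amax(dim=2))
    return loss_x + loss_y


# Usage with made-up data: logits stand in for the model output
logits = torch.randn(2, 16, 16, requires_grad=True)
gt_boxes = torch.tensor([[2.0, 3.0, 9.0, 12.0], [0.0, 0.0, 5.0, 5.0]])
loss = projection_loss(torch.sigmoid(logits), gt_boxes)
loss.backward()
print(logits.grad is not None)  # True: the gradient reaches the logits
```

Because only `sigmoid`, `amax`, and binary cross-entropy sit between the logits and the loss, every step stays on the autograd graph.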

I would start a debugger and set breakpoint(s) after your forward pass, then go line by line through your code and monitor the gradient functions, checking them for consistency.
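The same check can be done without a debugger by printing `requires_grad` and `grad_fn` for each intermediate tensor. The sketch below is a minimal stand-in for the training step in the question, not the original pipeline: `weight` plays the role of the model parameters, and the box extraction is replaced by a simple hypothetical reduction over the binary mask. It also mimics `Variable(mask_bboxes, requires_grad=True)` with its modern equivalent, `detach().requires_grad_(True)`.

```python
import torch

torch.manual_seed(0)

weight = torch.randn(8, 8, requires_grad=True)  # hypothetical model parameters
image = torch.randn(2, 8, 8)
target = torch.zeros(2)

logits = image * weight                      # tracked by autograd
probs = torch.sigmoid(logits)                # still tracked
binary = probs > 0.5                         # bool tensor: tracking stops here
mask_stat = binary.float().sum(dim=(1, 2))   # stand-in for the box extraction

for name, t in [("logits", logits), ("probs", probs),
                ("binary", binary), ("mask_stat", mask_stat)]:
    print(f"{name:10s} requires_grad={t.requires_grad} grad_fn={t.grad_fn}")

# Re-wrapping creates a brand-new leaf with no history back to `weight`
leaf = mask_stat.detach().requires_grad_(True)
loss = torch.mean((leaf - target) ** 2)
loss.backward()                              # runs without error

print(leaf.grad is not None)  # True: the gradient stops at the new leaf
print(weight.grad)            # None: nothing reaches the parameters
```

Note that `loss.grad` is `None` for any non-leaf tensor unless `retain_grad()` was called, so it says little on its own. The more telling check is whether the parameters' `.grad` is populated after `backward()`.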