Implementing DIoU, calculate grad explicitly


I’ve implemented the DIoU loss as described in this paper. Unfortunately, I’m getting the error "grad can be implicitly created only for scalar outputs".

Following this thread, I suppose this is because my returned loss has the shape (B x Loss), i.e. one loss for each element in the batch. My loss function returns the following: 1 - self.bbox_iou(y, y_hat) + self.bbox_diou_penalty(y, y_hat).

I could use something like torch.mean() or torch.sum() to solve this problem. However, this would presumably result in bad predictions. From my rudimentary understanding, I guess there is a way to calculate the grad explicitly using the requires_grad flag, but how exactly do I do this?


If you reduce the loss to a scalar and call .backward() on it, a torch.ones(1) gradient is implicitly passed as the gradient argument to backward. Since your loss has multiple elements, you could instead use e.g. loss.backward(gradient=torch.ones_like(loss)).
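As a rough illustration, here is a minimal sketch of that call on a per-sample loss. The function per_sample_loss and the tensors are hypothetical stand-ins (a simple squared error, not the DIoU loss from the question). It also compares the result against loss.sum().backward(): passing a gradient of all ones weights every element equally, so both calls should produce the same gradients.

```python
import torch

# Hypothetical stand-in for a per-sample loss of shape (B,).
# This is a plain squared error, NOT the DIoU loss from the question.
def per_sample_loss(y, y_hat):
    return ((y - y_hat) ** 2).sum(dim=1)

# Made-up boxes: batch of 2, four coordinates each.
y = torch.tensor([[0.0, 0.0, 2.0, 2.0],
                  [1.0, 1.0, 3.0, 3.0]])
y_hat_a = torch.tensor([[0.5, 0.0, 2.0, 2.5],
                        [1.0, 1.5, 3.0, 2.0]], requires_grad=True)
# Independent copy of the predictions for the comparison run.
y_hat_b = y_hat_a.detach().clone().requires_grad_(True)

# Variant 1: non-scalar loss, gradient argument passed explicitly.
loss_a = per_sample_loss(y, y_hat_a)  # shape (2,), one loss per sample
loss_a.backward(gradient=torch.ones_like(loss_a))

# Variant 2: reduce to a scalar first, then call backward() without arguments.
loss_b = per_sample_loss(y, y_hat_b)
loss_b.sum().backward()

# Check whether both variants produced the same gradients.
same_grads = torch.allclose(y_hat_a.grad, y_hat_b.grad)
print(same_grads)
```

Using torch.mean() instead of torch.sum() would only scale these gradients by 1/B, which is usually absorbed by the learning rate.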