Runtime error on program

kurianbenoy · July 8, 2019, 5:58pm

  File "/home/kurian/Projects/pytorch-ssd/.env/lib/python3.7/site-packages/torch/nn/functional.py", line 1871, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: invalid argument 2: dimension -1 out of range of 2D tensor at /pytorch/aten/src/TH/generic/THTensor.cpp:37

I am getting a very strange error, and searching for it online has not turned up a solution. There is also a deprecation warning, which I suspect is related to this error message:

/home/kurian/Projects/pytorch-ssd/.env/lib/python3.7/site-packages/torch/nn/_reduction.py:46: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
  warnings.warn(warning.format(ret))

ptrblck · July 8, 2019, 10:44pm

Could you print the shapes of input and target to your criterion?
The code you’ve used to initialize the criterion might also be interesting to see.

The warning just states that you are using a deprecated argument (size_average or reduce) instead of the new reduction argument.
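
As a minimal sketch of what the reduction argument does, the snippet below calls F.cross_entropy with each of the three accepted strings. The tensor sizes and values are made up for illustration and are not taken from the SSD code in this thread.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Made-up example data: 5 samples, 3 classes.
logits = torch.randn(5, 3)                # shape (N, C), raw scores
target = torch.tensor([0, 2, 1, 1, 0])    # shape (N,), class indices (int64)

# 'none' keeps one loss value per sample.
per_sample = F.cross_entropy(logits, target, reduction='none')
# 'sum' adds the per-sample losses (what the warning suggests here).
summed = F.cross_entropy(logits, target, reduction='sum')
# 'mean' averages them (the default).
averaged = F.cross_entropy(logits, target, reduction='mean')

print(per_sample.shape, summed.item(), averaged.item())
```

The warning's suggestion of reduction='sum' corresponds to the legacy combination size_average=False with reduce left at its default.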

kurianbenoy · July 9, 2019, 2:30am

@ptrblck the criterion is defined as:

    criterion = MultiboxLoss(config.priors, iou_threshold=0.5, neg_pos_ratio=3,
                             center_variance=0.1, size_variance=0.2, device=DEVICE)

MultiboxLoss is defined as a class, as shown here:

import torch.nn as nn
import torch.nn.functional as F
import torch


from ..utils import box_utils


class MultiboxLoss(nn.Module):
    def __init__(self, priors, iou_threshold, neg_pos_ratio,
                 center_variance, size_variance, device):
        """Implement SSD Multibox Loss.

        Basically, Multibox loss combines classification loss
         and Smooth L1 regression loss.
        """
        super(MultiboxLoss, self).__init__()
        self.iou_threshold = iou_threshold
        self.neg_pos_ratio = neg_pos_ratio
        self.center_variance = center_variance
        self.size_variance = size_variance
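        # Tensor.to returns a new tensor rather than moving it in place,
        # so keep the returned value (assumes priors is a tensor).
        self.priors = priors.to(device)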

    def forward(self, confidence, predicted_locations, labels, gt_locations):
        """Compute classification loss and smooth l1 loss.

        Args:
            confidence (batch_size, num_priors, num_classes): class predictions.
            predicted_locations (batch_size, num_priors, 4): predicted locations.
            labels (batch_size, num_priors): real labels of all the priors.
            gt_locations (batch_size, num_priors, 4): real boxes corresponding to all the priors.
        """
        num_classes = confidence.size(2)
        with torch.no_grad():
            # derived from cross_entropy=sum(log(p))
            loss = -F.log_softmax(confidence, dim=2)[:, :, 0]
            mask = box_utils.hard_negative_mining(loss, labels, self.neg_pos_ratio)

        confidence = confidence[mask, :]
        classification_loss = F.cross_entropy(confidence.reshape(-1, num_classes), labels[mask], reduction='sum')
        pos_mask = labels > 0
        predicted_locations = predicted_locations[pos_mask, :].reshape(-1, 4)
        gt_locations = gt_locations[pos_mask, :].reshape(-1, 4)
        smooth_l1_loss = F.smooth_l1_loss(predicted_locations, gt_locations, reduction=False)
        num_pos = gt_locations.size(0)
        return smooth_l1_loss/num_pos, classification_loss/num_pos

I will check the input and target passed to the criterion.
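
To make the expected shapes in forward concrete, here is a small sketch that mimics the indexing steps with made-up sizes. The mask is a hand-written stand-in for the output of box_utils.hard_negative_mining, which is not reproduced here; the only assumption is that it is a boolean tensor of shape (batch_size, num_priors). Note also that reduction takes one of the strings 'none', 'mean' or 'sum', so the reduction=False in the posted smooth_l1_loss call would probably need to become reduction='sum', which is what the sketch uses.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Made-up sizes, for illustration only.
batch_size, num_priors, num_classes = 2, 6, 3
confidence = torch.randn(batch_size, num_priors, num_classes)
labels = torch.randint(0, num_classes, (batch_size, num_priors))  # int64
predicted_locations = torch.randn(batch_size, num_priors, 4)
gt_locations = torch.randn(batch_size, num_priors, 4)

# Stand-in for the hard negative mining result: a bool mask over priors.
mask = torch.zeros(batch_size, num_priors, dtype=torch.bool)
mask[0, :3] = True   # 3 priors kept in the first image
mask[1, 2:] = True   # 4 priors kept in the second image

# Boolean indexing flattens the two masked dimensions into one.
selected_conf = confidence[mask, :]   # shape (7, num_classes)
selected_labels = labels[mask]        # shape (7,)
print(selected_conf.shape, selected_labels.shape, selected_labels.dtype)

classification_loss = F.cross_entropy(
    selected_conf.reshape(-1, num_classes), selected_labels, reduction='sum')

# Regression loss over positive priors only.
pos_mask = labels > 0
smooth_l1 = F.smooth_l1_loss(
    predicted_locations[pos_mask, :].reshape(-1, 4),
    gt_locations[pos_mask, :].reshape(-1, 4),
    reduction='sum')
print(classification_loss.item(), smooth_l1.item())
```

If the real labels tensor does not have shape (batch_size, num_priors), or is not an integer tensor, labels[mask] will not produce the 1D class-index target that F.cross_entropy expects in this usage.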

kurianbenoy · July 9, 2019, 3:50am

On checking the inputs passed to the criterion:

regression_loss, classification_loss = criterion(confidence, locations, labels, boxes)

I realised that locations is an all-zero tensor, as shown here:
tensor([[0, 0, 0, …, 0, 0, 0],
[0, 0, 0, …, 0, 0, 0],
[0, 0, 0, …, 0, 0, 0],
…,
[0, 0, 0, …, 0, 0, 0],
[0, 0, 0, …, 0, 0, 0],
[0, 0, 0, …, 0, 0, 0]])

ptrblck · July 9, 2019, 10:38pm

Does this explain the error?
If not, could you also print the shapes of both tensors passed to F.cross_entropy?
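
One way to do that check is sketched below. describe_ce_inputs is a hypothetical helper written for this thread, not part of PyTorch. It assumes the usage in the posted code: logits reshaped to (N, C) and class-index targets of shape (N,) with dtype int64.

```python
import torch
import torch.nn.functional as F


def describe_ce_inputs(logits, target):
    """Hypothetical helper: list mismatches with the (N, C) / (N,) convention."""
    print("logits:", tuple(logits.shape), logits.dtype)
    print("target:", tuple(target.shape), target.dtype)
    problems = []
    if logits.dim() != 2:
        problems.append("logits should be 2D (N, C)")
    if target.dim() != 1:
        problems.append("target should be 1D (N,)")
    if target.dtype != torch.long:
        problems.append("target should have dtype torch.long")
    if logits.dim() == 2 and target.dim() == 1 and logits.size(0) != target.size(0):
        problems.append("logits and target disagree on N")
    return problems


# Made-up tensors for illustration.
logits = torch.randn(4, 3)
good_target = torch.tensor([0, 1, 2, 1])
bad_target = torch.zeros(4, 3)   # 2D float tensor, e.g. boxes passed as labels

print(describe_ce_inputs(logits, good_target))  # no problems expected
print(describe_ce_inputs(logits, bad_target))   # shape and dtype problems

# With a well-formed pair the loss call itself goes through.
loss = F.cross_entropy(logits, good_target, reduction='sum')
```

In the posted forward, the call would go right before F.cross_entropy, on confidence.reshape(-1, num_classes) and labels[mask].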