Error while using RetinaNet with a ResNet50-FPN backbone

I tried to use this model for my object detection task:

from torchvision.models.detection import retinanet_resnet50_fpn

model = retinanet_resnet50_fpn(pretrained=False, progress=True,
                               num_classes=num_classes,
                               pretrained_backbone=pretrained_backbone)

Reference: vision/retinanet.py at master · pytorch/vision · GitHub

This error appeared. Help!

Epoch: [0]  [  0/457]  eta: 0:19:42  lr: 0.000032  loss: 1.8015 (1.8015)  classification: 1.1287 (1.1287)  bbox_regression: 0.6729 (0.6729)  time: 2.5873  data: 0.5801  max mem: 4855
Traceback (most recent call last):
  File "/raid/sahil_g_ma/wheatDetection/FRCNN_Resnet_training.py", line 99, in <module>
    train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=100)
  File "/raid/sahil_g_ma/wheatDetection/detection/engine.py", line 30, in train_one_epoch
    loss_dict = model(images, targets)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torchvision/models/detection/retinanet.py", line 547, in forward
    losses = self.compute_loss(targets, head_outputs, anchors)
  File "/opt/conda/lib/python3.6/site-packages/torchvision/models/detection/retinanet.py", line 411, in compute_loss
    return self.head.compute_loss(targets, head_outputs, anchors, matched_idxs)
  File "/opt/conda/lib/python3.6/site-packages/torchvision/models/detection/retinanet.py", line 51, in compute_loss
    'classification': self.classification_head.compute_loss(targets, head_outputs, matched_idxs),
  File "/opt/conda/lib/python3.6/site-packages/torchvision/models/detection/retinanet.py", line 120, in compute_loss
    ] = 1.0
IndexError: tensors used as indices must be long, byte or bool tensors

Also, how can I use a ResNet152-FPN as the backbone of RetinaNet?
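One approach I'm considering for that (untested sketch, mirroring what retinanet_resnet50_fpn builds internally; num_classes is my own variable) is to build the FPN backbone manually and pass it to the RetinaNet class:

from torchvision.models.detection import RetinaNet
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone
from torchvision.ops.feature_pyramid_network import LastLevelP6P7

# Build a ResNet-152 + FPN backbone; returned_layers / extra_blocks follow
# the defaults that retinanet_resnet50_fpn uses for its ResNet-50 backbone.
backbone = resnet_fpn_backbone('resnet152', pretrained=True,
                               returned_layers=[2, 3, 4],
                               extra_blocks=LastLevelP6P7(256, 256))

model = RetinaNet(backbone, num_classes=num_classes)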

The second-to-last frame in your traceback is indexing with something derived from the targets you pass in line 99 of FRCNN_Resnet_training.py. Maybe one of the target tensors has the wrong dtype? Tensors used as indices have to be long (int64), byte, or bool, but perhaps one of yours is a float tensor?
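To check, you could print the dtypes of one sample before training (rough sketch; dataset here stands for your Dataset instance):

image, target = dataset[0]
for name, tensor in target.items():
    print(name, tensor.dtype)
# 'labels' (and anything used as an index) must be torch.int64;
# 'boxes' is normally a float tensor.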

One of the targets was in float dtype. I then tried changing all of them to int64, but the error remains the same.

# Convert everything into a torch.Tensor
boxes = torch.as_tensor(boxes, dtype=torch.int64)

# Get the labels. We have only one class (wheat head)
labels = torch.ones((n_objects,), dtype=torch.int64)

areas = torch.as_tensor(areas, dtype=torch.int64)

# Suppose all instances are not crowd
iscrowd = torch.zeros((n_objects,), dtype=torch.int64)

if n_objects == 0:
    boxes = torch.zeros((0, 4), dtype=torch.int64)

target = {
    'boxes': boxes,
    'labels': labels,
    'image_id': torch.tensor([index], dtype=torch.int64),
    'area': areas,
    'iscrowd': iscrowd
}

# if self.transform is not None:
image = self.transform(image)

return image, target

Actually, it is related to the torchvision version. Earlier I was using 0.8.0a0; changing to 0.9.0a0 resolved the issue.
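For anyone hitting the same error, the installed version can be checked with:

import torchvision
print(torchvision.__version__)  # 0.8.0a0 showed the error here; 0.9.0a0 resolved it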