Finetuning FasterRCNN for multi-class examples

Hello,

I’m trying to fine-tune Faster R-CNN with the ResNet-50 backbone via torchvision. Using some references, I have arrived at a __getitem__ method that looks like the following:


# Imports used by the method below
import numpy as np
import torch
from PIL import Image

def __getitem__(self, index: int):
    file_name = self.file_names[index]
    # All annotation rows (one per box) for this image
    records = self.data[self.data['file_name'] == file_name]

    # Load the image and scale pixel values to [0, 1]
    image = np.array(Image.open(file_name), dtype=np.float32)
    image /= 255.0

    if self.transform:
        image = self.transform(image)

    if self.mode != "test":
        # Boxes as [xmin, ymin, xmax, ymax], shape [N, 4]
        boxes = torch.as_tensor(
            records[['xmin', 'ymin', 'xmax', 'ymax']].values, dtype=torch.float32
        )
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])

        # Every box gets label 1 (a single foreground class)
        labels = torch.ones((records.shape[0],), dtype=torch.int64)
        iscrowd = torch.zeros((records.shape[0],), dtype=torch.int64)

        target = {
            'boxes': boxes,
            'labels': labels,
            'image_id': torch.tensor([index]),
            'area': area,
            'iscrowd': iscrowd,
        }
        return image, target, file_name
    else:
        return image, file_name

The line labels = torch.ones((records.shape[0],), dtype=torch.int64) assumes that there is only one foreground class, since label 0 is reserved for the background in Faster R-CNN.

I have a dataset with multiple classes, and I can't figure out how to modify this line and pass the one-hot encoded labels to train the Faster R-CNN model in a multi-class scenario.
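
For reference, torchvision's detection models take target['labels'] as an int64 tensor of class indices with one entry per box, so one-hot vectors should not be needed. Below is a minimal sketch of one way such labels could be built. The class names, the CLASS_TO_IDX mapping, the encode_labels helper and the 'class' column name are all hypothetical and not part of my dataset code above.

```python
import torch

# Hypothetical class names; index 0 is left free for the background class.
CLASS_NAMES = ["car", "person", "bicycle"]
CLASS_TO_IDX = {name: i + 1 for i, name in enumerate(CLASS_NAMES)}


def encode_labels(names, class_to_idx=CLASS_TO_IDX):
    """Map per-box class names to an int64 tensor of class indices, shape [N]."""
    return torch.as_tensor([class_to_idx[n] for n in names], dtype=torch.int64)


# Example: three boxes in one image. Inside __getitem__ the names could come
# from records['class'], where 'class' is an assumed column name.
labels = encode_labels(["person", "car", "person"])

# The model's classification head needs one output per foreground class
# plus one for the background.
num_classes = len(CLASS_NAMES) + 1
```

Under these assumptions, the labels = torch.ones(...) line would become something like labels = encode_labels(records['class']), and the model would have to be built with num_classes outputs rather than 2.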

Any advice would be great,

Thank you!