Hello,
I’m trying to fine-tune Faster R-CNN with the ResNet-50 backbone via Torchvision. Using some references, I have arrived at a __getitem__ method which looks like the following:
import numpy as np
import torch
from PIL import Image

# Method of my Dataset subclass
def __getitem__(self, index: int):
    file_name = self.file_names[index]
    # All annotation rows (one per box) that belong to this image
    records = self.data[self.data['file_name'] == file_name]

    # Load the image and scale pixel values to [0, 1]
    image = np.array(Image.open(file_name), dtype=np.float32)
    image /= 255.0
    if self.transform:
        image = self.transform(image)

    if self.mode != "test":
        boxes = records[['xmin', 'ymin', 'xmax', 'ymax']].values
        # Box area = height * width
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
        area = torch.as_tensor(area, dtype=torch.float32)
        # Every box gets label 1 (single foreground class)
        labels = torch.ones((records.shape[0],), dtype=torch.int64)
        iscrowd = torch.zeros((records.shape[0],), dtype=torch.int64)

        target = {}
        target['boxes'] = torch.as_tensor(boxes, dtype=torch.float32)
        target['labels'] = labels
        target['image_id'] = torch.tensor([index])
        target['area'] = area
        target['iscrowd'] = iscrowd
        return image, target, file_name
    else:
        return image, file_name
The line labels = torch.ones((records.shape[0],), dtype=torch.int64)
assumes that there is only one foreground class, since label 0 is reserved for the background in Faster R-CNN.
I have a dataset with multiple classes, and I am unable to figure out how to modify this line and pass the one-hot encoded version of the labels to train the Faster R-CNN model in a multi-class scenario.
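To make the question concrete: as far as I can tell from the torchvision docs, the detection models expect target['labels'] to be an int64 tensor with one class index per box, not a one-hot matrix. Below is a rough sketch of what I imagine the label construction could look like. It assumes my annotations carry a class name per box; the helper names build_class_to_idx and encode_labels and the example class names are made up for illustration and are not part of torchvision.

```python
def build_class_to_idx(class_names):
    """Map each distinct class name to an integer index starting at 1.

    Index 0 is left free because Faster R-CNN reserves it for the background.
    """
    unique = sorted(set(class_names))
    return {name: idx for idx, name in enumerate(unique, start=1)}


def encode_labels(box_class_names, class_to_idx):
    """Return one integer class index per box (not a one-hot vector)."""
    return [class_to_idx[name] for name in box_class_names]


# Hypothetical annotations: one class name per bounding box in the dataset.
all_names = ["car", "person", "car", "dog"]
class_to_idx = build_class_to_idx(all_names)

# Labels for one image with two boxes; in __getitem__ this list would be
# wrapped as torch.as_tensor(labels, dtype=torch.int64).
labels = encode_labels(["person", "car"], class_to_idx)

# The model's num_classes would then be the foreground classes plus background.
num_classes = len(class_to_idx) + 1
```

If this is the right idea, I assume the box predictor head would also need to be built with num_classes outputs rather than 2.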
Any advice would be great,
Thank you!