RCNN object detection

Hello
I am a beginner and I am trying to learn R-CNN from TorchVision. I have 8 classes of objects, and I have written a dataset class with this __getitem__ method, where labels is a numpy array holding the classes of the boxes on the image (for example [3, 5]).

def __getitem__(self, index):
    path_image = self.paths[index]
    image, boxes, labels, area = self.load_image_and_boxes(index)
    target = {}
    target['boxes'] = boxes
    # TorchVision detection models expect labels as an int64 tensor, not a numpy array
    target['labels'] = torch.as_tensor(labels, dtype=torch.int64)
    target['image_id'] = torch.tensor([index])
    target['area'] = area
    if self.transforms:
        sample = self.transforms(**{
            'image': image,
            'bboxes': target['boxes'],
            'labels': labels
        })
        image = sample['image']
        target['boxes'] = torch.as_tensor(sample['bboxes'], dtype=torch.float32)
    else:
        target['boxes'] = torch.as_tensor(target['boxes'], dtype=torch.float32)
    target['boxes'] = target['boxes'].reshape(-1, 4)
    image = np.transpose(image, (2, 0, 1))
    image = torch.from_numpy(image)
    return image, target, path_image
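As a side note, since each image can have a different number of boxes, the default DataLoader collation cannot stack the targets. A common pattern (sketched here with a hypothetical stand-in dataset, not your real one) is a collate_fn that simply zips the batch:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class DummyDetectionDataset(Dataset):
    # Stand-in for the real dataset: returns (image, target, path)
    # with the same shapes as __getitem__ above.
    def __len__(self):
        return 4

    def __getitem__(self, index):
        image = torch.rand(3, 64, 64)
        target = {
            'boxes': torch.tensor([[0., 0., 10., 10.]]),
            'labels': torch.tensor([1], dtype=torch.int64),
            'image_id': torch.tensor([index]),
            'area': torch.tensor([100.]),
        }
        return image, target, f'img_{index}.jpg'

def collate_fn(batch):
    # Turn a list of (image, target, path) tuples into
    # (tuple_of_images, tuple_of_targets, tuple_of_paths)
    return tuple(zip(*batch))

loader = DataLoader(DummyDetectionDataset(), batch_size=2, collate_fn=collate_fn)
images, targets, paths = next(iter(loader))
# images is a tuple of 2 image tensors, targets a tuple of 2 dicts
```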

After training, the R-CNN returns outputs with 8 items. This is one of them:
outputs[7]

{'boxes': tensor([[ 0.0000, 64.0552, 500.5320, 477.4697],
[ 22.2902, 121.1948, 360.7867, 475.3104],
[ 6.6014, 123.5036, 282.5276, 460.8509],
[206.0831, 77.8670, 505.2312, 443.2711],
[ 10.5778, 97.3200, 489.6339, 433.8982],
[ 16.5061, 66.0151, 495.6974, 367.8644],
[ 0.0000, 264.9781, 240.3872, 496.7976],
[141.8789, 51.5878, 508.8718, 501.8432],
[244.2009, 45.8100, 512.0000, 479.3023],
[243.4653, 92.4951, 512.0000, 474.3302],
[287.1657, 273.5756, 483.0078, 463.3998],
[ 14.1116, 115.1430, 275.1363, 383.5063],
[ 0.0000, 118.9819, 428.0742, 509.8760],
[336.0939, 133.6007, 501.2955, 395.3423],
[331.1056, 110.4328, 497.5609, 425.7774],
[259.7857, 140.4948, 505.6891, 472.5172],
[ 36.9824, 208.6160, 237.9554, 512.0000],
[ 19.1136, 186.5290, 151.8781, 499.3608],
[213.0657, 102.6356, 474.4019, 215.7374],
[ 0.0000, 266.3952, 198.0168, 487.8173],
[ 23.2956, 125.9192, 169.1113, 444.2588],
[ 64.2475, 91.6519, 478.7527, 428.1402],
[330.7071, 115.4189, 491.4273, 420.2543],
[ 0.0000, 96.2847, 219.0983, 512.0000],
[ 0.0000, 478.2526, 189.6664, 512.0000],
[ 11.6199, 131.2440, 319.1119, 458.8158]], grad_fn=),
'labels': tensor([2, 7, 2, 7, 6, 7, 2, 4, 2, 1, 4, 6, 4, 1, 7, 4, 4, 2, 4, 7, 6, 1, 2, 7,
3, 1]),
'scores': tensor([0.4158, 0.2980, 0.2887, 0.2255, 0.1818, 0.1811, 0.1805, 0.1661, 0.1560,
0.1539, 0.1526, 0.1276, 0.1204, 0.1092, 0.0959, 0.0930, 0.0845, 0.0823,
0.0796, 0.0788, 0.0780, 0.0750, 0.0691, 0.0663, 0.0524, 0.0515],
grad_fn=)}
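From what I understand of the TorchVision detection API, each element of outputs describes one image, and the 'labels' tensor assigns a class to every predicted box. So boxes for a single class can be pulled out of one output dict by masking (shown here on a shortened, made-up stand-in for outputs[7]):

```python
import torch

# Shortened stand-in for one output dict (first three detections only)
output = {
    'boxes': torch.tensor([[0., 64., 500., 477.],
                           [22., 121., 360., 475.],
                           [6., 123., 282., 460.]]),
    'labels': torch.tensor([2, 7, 2]),
    'scores': torch.tensor([0.4158, 0.2980, 0.2887]),
}

# Keep only boxes predicted as class 2 with score above a threshold
keep = (output['labels'] == 2) & (output['scores'] > 0.25)
boxes_class2 = output['boxes'][keep]
# two boxes remain (rows 0 and 2)
```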

So I think I made a mistake in __getitem__: I expected each item in outputs to contain boxes belonging to only one class, but I don't know how to rewrite it properly.