I am using Faster R-CNN for object detection.
Since I only need it to detect vehicles, I currently just filter out the labels of non-vehicle objects. However, I would like the network itself to output scores and bounding boxes for vehicles only.
Mainly, I need to change the number of output features of model.roi_heads.box_predictor.bbox_pred. Currently, it is a linear layer with out_features=364, i.e. 4 outputs for each of the 91 classes of the COCO dataset. However, I would also like to exploit the feature extraction at this stage to predict other values.
I’ll try to explain better with an example:
- Let’s say I am only interested in predicting the ‘car’ class
car_label = 3
- I extract the 4 rows of the weight matrix (and the matching bias entries) that predict the car bounding box, since I want to reuse the pre-trained Faster R-CNN
my_weights = model.roi_heads.box_predictor.bbox_pred.weight.data[4*car_label:4*(car_label+1), :]
my_bias = model.roi_heads.box_predictor.bbox_pred.bias.data[4*car_label:4*(car_label+1)]
- I want the bounding box predictor to regress another output (5 outputs instead of 4)
model.roi_heads.box_predictor.bbox_pred = nn.Linear(in_features=1024, out_features=5, bias=True)
- But I still want to have the bounding box, exploiting the pre-trained network
model.roi_heads.box_predictor.bbox_pred.weight.data[:4, :] = my_weights
model.roi_heads.box_predictor.bbox_pred.bias.data[:4] = my_bias
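The steps above can be sketched on stand-in layers as follows. This is only a minimal illustration of the weight copy, not a fix for the error: `old_head` is a hypothetical stand-in for the pre-trained model.roi_heads.box_predictor.bbox_pred (randomly initialised here so the snippet runs without downloading weights), and `new_head` is the 5-output replacement. Copying with copy_ under torch.no_grad() is one way to avoid going through .data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

car_label = 3  # COCO category id used in the post

# Stand-in for the pre-trained bbox_pred layer (assumption: random weights here).
old_head = nn.Linear(in_features=1024, out_features=364, bias=True)

# Rows belonging to the car class: 4 regression outputs per class.
rows = slice(4 * car_label, 4 * (car_label + 1))

# Replacement layer with one extra, freshly initialised output.
new_head = nn.Linear(in_features=1024, out_features=5, bias=True)

with torch.no_grad():
    new_head.weight[:4].copy_(old_head.weight[rows])
    new_head.bias[:4].copy_(old_head.bias[rows])

# The first four outputs of the new layer reproduce the old car-box deltas;
# the fifth output is untrained.
features = torch.randn(8, 1024)
old_deltas = old_head(features)[:, rows]
new_out = new_head(features)
```

On the real model, the detection post-processing presumably still expects 4 values per class, so the fifth column would likely have to be split off before the boxes are decoded.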
This obviously doesn’t work:
RuntimeError: The expanded size of the tensor (1) must match the existing size (2) at non-singleton dimension 1. Target sizes: [10000, 1]. Tensor sizes: [10000, 2]
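One plausible reading of this message, assuming the box decoder selects the x, y, w and h deltas with stride-4 slices (0::4, 1::4, 2::4, 3::4): with 5 columns, the 0::4 slice picks up columns 0 and 4 while the other slices pick up one column each, so the pieces no longer have matching widths. The snippet below only illustrates that slicing on a dummy tensor; the row count 10000 is taken from the error message and `rel_codes` is a made-up name.

```python
import torch

# Dummy regression output: 10000 rows (as in the error message), 5 columns.
rel_codes = torch.zeros(10000, 5)

dx = rel_codes[:, 0::4]  # columns 0 and 4
dy = rel_codes[:, 1::4]  # column 1
dw = rel_codes[:, 2::4]  # column 2
dh = rel_codes[:, 3::4]  # column 3

shapes = [tuple(t.shape) for t in (dx, dy, dw, dh)]
print(shapes)  # [(10000, 2), (10000, 1), (10000, 1), (10000, 1)]
```

The [10000, 2] versus [10000, 1] mismatch has the same shape pattern as the one in the RuntimeError above.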
But even if I set only 4 output features (out_features=4), an error is still raised, this time at the step where low-scoring boxes are removed from the output:
/usr/local/lib/python3.6/dist-packages/torchvision/models/detection/roi_heads.py in postprocess_detections(self, class_logits, box_regression, proposals, image_shapes)
    502         # remove low scoring boxes
    503         inds = torch.nonzero(scores > self.score_thresh).squeeze(1)
--> 504         boxes, scores, labels = boxes[inds], scores[inds], labels[inds]
    505
    506         # remove empty boxes

IndexError: index is out of bounds for dimension with size 0
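A possible cause, stated as an assumption rather than a diagnosis: the post-processing reshapes the regression output to one box per class while cls_score still emits 91 scores, so with a 4-output bbox_pred the score and box tensors no longer have matching lengths. Under that assumption, one option is to shrink both layers together to "background + car". The sketch below does this on hypothetical stand-in layers (`old_cls`, `old_box`, randomly initialised so it runs without pre-trained weights).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

car_label = 3
keep = [0, car_label]  # background + car, in the new class order

# Stand-ins for the pre-trained box_predictor layers (assumption: random weights).
old_cls = nn.Linear(1024, 91)
old_box = nn.Linear(1024, 91 * 4)

# New heads with matching class counts: 2 scores and 2 * 4 box deltas.
new_cls = nn.Linear(1024, len(keep))
new_box = nn.Linear(1024, len(keep) * 4)

# Row indices of the kept classes in the old layers.
cls_rows = torch.tensor(keep)
box_rows = torch.tensor([4 * c + k for c in keep for k in range(4)])

with torch.no_grad():
    new_cls.weight.copy_(old_cls.weight[cls_rows])
    new_cls.bias.copy_(old_cls.bias[cls_rows])
    new_box.weight.copy_(old_box.weight[box_rows])
    new_box.bias.copy_(old_box.bias[box_rows])

features = torch.randn(8, 1024)
scores = new_cls(features)  # shape (8, 2): background and car logits
deltas = new_box(features)  # shape (8, 8): 4 deltas per kept class
```

On the real model the stand-ins would correspond to model.roi_heads.box_predictor.cls_score and model.roi_heads.box_predictor.bbox_pred. A softmax over two logits will not give the same scores as the original 91-way softmax, so the score threshold may need adjusting, and the car class would be reported as label 1 instead of 3.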
Any idea on how to do something like this, or how to disable the removal of low-scoring boxes (since I actually need only one box)? I am also open to ideas on how to add another linear layer that takes the 1024 in_features as input and provides a readable output, although I would prefer to remove this last large matrix multiplication since I don’t need it (and I tried to delete it following these steps but I had errors on the input of a