Hello everyone.
I am using Faster R-CNN for object detection.
Since I only need to detect vehicles, I am currently just filtering out the labels of non-vehicle objects. However, I would like the network itself to output scores and bounding boxes for vehicles only.
Mainly, I need to change the number of output features of model.roi_heads.box_predictor.bbox_pred. Currently, it is a linear layer with out_features=364, basically 4 outputs for each of the 91 classes of the COCO dataset. However, I would also like to exploit the feature extraction at this stage to predict other values.
I’ll try to explain better with an example:
- Let’s say I am just interested in predicting ‘car’ class
car_label = 3
- I extract the 4 rows in the weights matrix to predict the bounding box (since I want to use pre-trained Faster R-CNN)
my_weights = model.roi_heads.box_predictor.bbox_pred.weight.data[4*car_label:4*(car_label+1), :]
my_bias = model.roi_heads.box_predictor.bbox_pred.bias.data[4*car_label:4*(car_label+1)]
- I want the bounding box predictor to regress another output (5 outputs instead of 4)
model.roi_heads.box_predictor.bbox_pred = nn.Linear(in_features=1024, out_features=5, bias=True)
- But I still want to have the bounding box, exploiting the pre-trained network
model.roi_heads.box_predictor.bbox_pred.weight.data[:4,:] = my_weights
model.roi_heads.box_predictor.bbox_pred.bias.data[:4] = my_bias
This obviously doesn’t work:
RuntimeError: The expanded size of the tensor (1) must match the existing size (2) at non-singleton dimension 1. Target sizes: [10000, 1]. Tensor sizes: [10000, 2]
But even if I set only 4 output features (out_features=4), an error is still raised, this time at the step that removes low-scoring boxes from the output:
/usr/local/lib/python3.6/dist-packages/torchvision/models/detection/roi_heads.py in postprocess_detections(self, class_logits, box_regression, proposals, image_shapes)
502 # remove low scoring boxes
503 inds = torch.nonzero(scores > self.score_thresh).squeeze(1)
--> 504 boxes, scores, labels = boxes[inds], scores[inds], labels[inds]
505
506 # remove empty boxes
IndexError: index is out of bounds for dimension with size 0
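A possible explanation for both errors is that the torchvision post-processing seems to expect bbox_pred to produce exactly 4 values for every class that cls_score produces, so changing only bbox_pred leaves the two layers out of sync. The following is a minimal sketch, not a confirmed fix: it keeps two classes (background and car) and copies the matching pre-trained rows into smaller layers. To stay self-contained it uses randomly initialised stand-in layers with the same shapes as the pre-trained head, and the helper name slice_head and the extra_head layer are hypothetical.

```python
import torch
from torch import nn

torch.manual_seed(0)

NUM_COCO_CLASSES = 91   # as stated in the post (364 = 91 * 4)
IN_FEATURES = 1024
CAR_LABEL = 3
KEEP = [0, CAR_LABEL]   # assumption: index 0 is the background class

# Stand-ins for model.roi_heads.box_predictor.cls_score / bbox_pred.
# With a real model, these would be the pre-trained layers instead.
old_cls_score = nn.Linear(IN_FEATURES, NUM_COCO_CLASSES)
old_bbox_pred = nn.Linear(IN_FEATURES, NUM_COCO_CLASSES * 4)


def slice_head(old_cls, old_bbox, keep):
    """Hypothetical helper: build smaller layers that keep only `keep` classes."""
    keep_idx = torch.tensor(keep)
    # Each class owns 4 consecutive rows of bbox_pred: 4*c .. 4*c + 3.
    box_rows = (keep_idx[:, None] * 4 + torch.arange(4)).reshape(-1)

    new_cls = nn.Linear(old_cls.in_features, len(keep))
    new_bbox = nn.Linear(old_bbox.in_features, 4 * len(keep))
    with torch.no_grad():
        new_cls.weight.copy_(old_cls.weight[keep_idx])
        new_cls.bias.copy_(old_cls.bias[keep_idx])
        new_bbox.weight.copy_(old_bbox.weight[box_rows])
        new_bbox.bias.copy_(old_bbox.bias[box_rows])
    return new_cls, new_bbox


new_cls_score, new_bbox_pred = slice_head(old_cls_score, old_bbox_pred, KEEP)

# Hypothetical separate layer for the extra (5th) value, fed by the same
# 1024-dimensional features instead of widening bbox_pred.
extra_head = nn.Linear(IN_FEATURES, 1)

# Check that the car box deltas are unchanged by the slicing.
features = torch.randn(8, IN_FEATURES)
car_deltas_old = old_bbox_pred(features)[:, 4 * CAR_LABEL:4 * (CAR_LABEL + 1)]
car_deltas_new = new_bbox_pred(features)[:, 4:8]
extra_values = extra_head(features)
```

With a real model, the two new layers would be assigned to model.roi_heads.box_predictor.cls_score and model.roi_heads.box_predictor.bbox_pred. Two caveats under this assumption: the car would then be reported as label 1 rather than 3, and its score would come from a softmax over 2 classes instead of 91, so the values would differ from the original ones. Reading the extra value at inference would also need a custom forward pass, which this sketch does not cover.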
Any idea on how to do something like this, or how to disable the removal of low-scoring boxes (since I actually need only one box)? I am also open to ideas on how to add another linear layer that takes the 1024 in_features as input and provides a readable output, even though I would prefer to remove this last large matrix multiplication since I don’t need it (I tried to remove it following these steps, but I got errors on the input of a conv layer).
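On the "only one box" part, the traceback shows that the threshold is stored as self.score_thresh on the RoI heads, so lowering model.roi_heads.score_thresh may be one way to stop boxes from being filtered out. Independently of that, the single best detection can be selected after the fact. The sketch below assumes boxes of shape [N, 4] and scores of shape [N], and the function name keep_best_box is hypothetical.

```python
import torch


def keep_best_box(boxes, scores):
    """Return the highest-scoring box and its score, keeping a leading dim of 1.

    Returns empty tensors when there are no detections, so callers do not
    hit an out-of-bounds index on an empty result.
    """
    if scores.numel() == 0:
        return boxes.new_zeros((0, 4)), scores.new_zeros((0,))
    best = torch.argmax(scores)
    return boxes[best:best + 1], scores[best:best + 1]


# Example values (made up for illustration).
boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0],
                      [5.0, 5.0, 20.0, 20.0],
                      [1.0, 1.0, 2.0, 2.0]])
scores = torch.tensor([0.2, 0.9, 0.05])

best_box, best_score = keep_best_box(boxes, scores)
empty_box, empty_score = keep_best_box(torch.zeros((0, 4)), torch.zeros((0,)))
```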
Thank you.