I am new to the field of object detection. I would be grateful if you could help me reduce the number of object classes detected by a pre-trained model that was trained on the COCO dataset. I only want to detect “person” and “dog”.
I am using the fasterrcnn_resnet50_fpn model:
# load model
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
I am not sure how I should modify the model to only detect class 1 (person) and class 18 (dog). I do not want to train the model on new data.
I would appreciate it if you could help me with this problem.
Thank you
Keep in mind that if you want to use the coco-pretrained backbone and rpn, and then train with a new ROI head (for different class structure), you can simply take the backbone and rpn, and use those to initialize a new faster r-cnn module by passing in the backbone and rpn as arguments to faster-rcnn in torchvision.
Thank you for your reply. But how can I specify that I am only interested in class 1 (person) and class 18 (dog)? I do not want to retrain the fasterrcnn_resnet50_fpn model. I want the model to detect only my desired classes and ignore the other 88 classes.
I tried to set num_classes to 3 but got an error.
My code:
RuntimeError: Error(s) in loading state_dict for FasterRCNN:
size mismatch for roi_heads.box_predictor.cls_score.weight: copying a param with shape torch.Size([91, 1024]) from checkpoint, the shape in current model is torch.Size([3, 1024]).
size mismatch for roi_heads.box_predictor.cls_score.bias: copying a param with shape torch.Size([91]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for roi_heads.box_predictor.bbox_pred.weight: copying a param with shape torch.Size([364, 1024]) from checkpoint, the shape in current model is torch.Size([12, 1024]).
size mismatch for roi_heads.box_predictor.bbox_pred.bias: copying a param with shape torch.Size([364]) from checkpoint, the shape in current model is torch.Size([12]).
Thank you for your answer. I do not want to train the model with a new dataset. I just want the model to detect my desired classes, which are available in the COCO dataset:
coco_names = ['__background__', 'person', 'dog']
and ignore the other classes.
I tried setting num_classes to 3 when loading the model, but it did not work.
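Since the goal here is to avoid retraining, one option is to leave the pretrained 91-class head untouched and simply discard unwanted predictions after inference. The following is a minimal sketch, not a definitive implementation. It assumes the usual torchvision output format in eval mode (one dict per image with "boxes", "labels" and "scores" tensors); the helper name filter_detections, the constant KEEP_LABELS and the 0.5 score threshold are hypothetical choices made for illustration. A synthetic output dict stands in for model(images)[0] so the example runs without downloading weights.

```python
import torch

# Label ids taken from the thread: 1 = person, 18 = dog.
KEEP_LABELS = (1, 18)


def filter_detections(output, keep_labels=KEEP_LABELS, score_thresh=0.5):
    """Keep detections whose label is in keep_labels and whose score passes the threshold.

    `output` is assumed to be one per-image dict of tensors that all share
    the same first dimension (one entry per detection).
    """
    labels = output["labels"]
    keep = torch.zeros_like(labels, dtype=torch.bool)
    for label in keep_labels:
        keep |= labels == label
    keep &= output["scores"] >= score_thresh
    return {key: value[keep] for key, value in output.items()}


# Synthetic stand-in for model(images)[0]; with the real model you would call
# model.eval() and run it under torch.no_grad() first.
example = {
    "boxes": torch.tensor([[0.0, 0.0, 10.0, 10.0],
                           [5.0, 5.0, 20.0, 20.0],
                           [1.0, 1.0, 4.0, 4.0],
                           [2.0, 2.0, 8.0, 8.0]]),
    "labels": torch.tensor([1, 3, 18, 1]),
    "scores": torch.tensor([0.9, 0.8, 0.7, 0.2]),
}
filtered = filter_detections(example)
print(filtered["labels"].tolist())  # [1, 18]
```

With this approach the model still computes scores for every class; only the returned results are restricted, so no weights change and no retraining is needed.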
OK, the error occurs because you cannot load the pretrained weights and also set num_classes to a different value: the shapes of the box predictor weights no longer match the checkpoint. At least it didn’t work for me either. You can try this, however:
import torch.nn as nn
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
# replace the classification layer with a new, randomly initialized one
model.roi_heads.box_predictor.cls_score = nn.Linear(1024, len(coco_names))
@Dwight_Foster Hi, I know it’s been some time since this post has been active.
But I tried your method and I have some doubts:
model.roi_heads.box_predictor.cls_score = nn.Linear(1024, len(coco_names)). Here we are just telling our model to predict 3 classes, but how does the model know that the classes should be ‘background’, ‘person’ and ‘dog’?
When you initialize the new layer, it does not know which class is which. The whole point of training the model is for it to learn which classes are which.
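If retraining is not an option, one way to carry the class identity over is to copy the pretrained rows for the wanted classes into smaller layers instead of starting from random weights. The sketch below is offered under assumptions and has not been checked against the real checkpoint: it assumes that row c of cls_score scores class c and that bbox_pred stores 4 consecutive rows per class (consistent with the 91 and 364 sizes in the error message above). The helper select_rows is a hypothetical name, and randomly initialized layers stand in for the pretrained ones so the example runs without downloading weights.

```python
import torch
from torch import nn


def select_rows(layer, rows):
    """Build a smaller nn.Linear whose outputs are the chosen rows of `layer`."""
    rows = torch.as_tensor(rows)
    new_layer = nn.Linear(layer.in_features, len(rows))
    with torch.no_grad():
        new_layer.weight.copy_(layer.weight[rows])
        new_layer.bias.copy_(layer.bias[rows])
    return new_layer


keep = [0, 1, 18]  # background, person, dog (ids from the thread)
# Assumption: class c occupies rows 4c .. 4c+3 of bbox_pred.
box_rows = [4 * c + k for c in keep for k in range(4)]

# Stand-ins for model.roi_heads.box_predictor.cls_score and .bbox_pred
cls_score = nn.Linear(1024, 91)
bbox_pred = nn.Linear(1024, 364)

new_cls_score = select_rows(cls_score, keep)
new_bbox_pred = select_rows(bbox_pred, box_rows)
print(new_cls_score.out_features, new_bbox_pred.out_features)  # 3 12
```

With the real model, the two new layers would be assigned back to model.roi_heads.box_predictor.cls_score and model.roi_heads.box_predictor.bbox_pred. The predicted labels would then be 1 (person) and 2 (dog) rather than 1 and 18, and the scores would differ from the original ones because the softmax runs over 3 logits instead of 91.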