Predicting objects outside dataset using FasterRCNN


I'm new to PyTorch and I'm trying to do transfer learning with a Faster R-CNN torchvision model.

I have read several tutorials and tried for days, but I always get the same result and I'm not able to find what's wrong. Surely more experienced people will spot the error easily.

I have two classes (cat, dog): cat is class 1 and dog is class 2. So, following the docs, I create my model as follows:

from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 3  # two classes plus background
model = fasterrcnn_resnet50_fpn(pretrained=True)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

I created my dataset, then a dataloader, etc., and I also printed some random dataset and dataloader entries as a sanity check; all looks OK.

I also plotted train loss and test loss to TensorBoard and trained for 20 epochs. Sure, it can be better.

Then I save the model to "models/" + MODEL_NAME + "/" + MODEL_NAME + "_final.pth".

And when I do inference, any detected object is a cat or a dog. I mean, if I use a cow image, the model says it's a cat or a dog. So it works perfectly with any image of the test set, but as soon as I use an image that doesn't contain a dog or a cat, the problem appears: every object is classified as a cat or a dog.

It's like any bbox that comes from the backbone gets assigned to one of the two classes, but from what I read there should be a class 0 for the background (anything that is not a cat or a dog).

Surely it's something dumb, but I'm not able to find the cause.

Any help will be welcome!

Thanks a lot

Hi @natxo

It seems like you are trying to do an out-of-sample prediction. Say you currently have a correctly trained FasterRCNN; it will of course work well on cats/dogs. What you are trying to do is predict something that wasn't part of your dataset.

You have 2 options, let me illustrate them :slight_smile:

Set confidence levels

So any such image of a cow, etc., should get a low confidence score, and you can probably set a threshold so that such low-confidence predictions are ignored.

For example, see Visualization utilities — Torchvision main documentation,

where we use FasterRCNN and set a threshold to ignore low-confidence results. That should help, as an image of a cow or an image of a fridge will probably get low confidence, and you can just filter it out.
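As a quick illustration of the filtering idea: a torchvision detection model in eval mode returns dicts with `boxes`, `labels`, and `scores`, and you can keep only the high-scoring entries. The tensors below are made-up stand-ins, and 0.7 is just an illustrative threshold.

```python
import torch

# Fake output dict shaped like what torchvision detection models
# return in eval mode; values here are illustrative stand-ins.
output = {
    "boxes": torch.tensor([[10.0, 10.0, 50.0, 60.0],
                           [20.0, 30.0, 80.0, 90.0],
                           [5.0, 5.0, 15.0, 25.0]]),
    "labels": torch.tensor([1, 2, 2]),
    "scores": torch.tensor([0.98, 0.91, 0.42]),
}

threshold = 0.7  # tune this on your own validation data
keep = output["scores"] > threshold
filtered = {k: v[keep] for k, v in output.items()}
# filtered keeps only the two detections scoring above 0.7
```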

Hard negative sampling.

Well, you remember the background class you mentioned, which is class index 0. It can help you train a model that specifically ignores particular data. You can try passing images of a cow together with a zero box tensor,
torch.tensor([0.0, 0.0, 0.0, 0.0], dtype=torch.float), as a background example. The model will learn to ignore it (thanks to the class index being 0 and the box tensor having all-zero coordinates).

Hope this helps you.

Aditya Oke

Hi @oke-aditya

Thanks for your detailed reply, i understand what you mean, let me add some info.

1) I'm currently only drawing detections with a score > 0.7, so setting confidence levels doesn't seem to work for me.

2) Regarding negative samples: it's not only those sample images, like the cow, that are detected as dog/cat; everything that is not a dog/cat is matched to one of those two categories. It seems like nothing is background. Would I need to add negative samples for everything that is not a dog or a cat?

So I think I'm doing something wrong; I have done the same with TF and YOLO in the past with no issues.

Please let me elaborate, as maybe I don't understand correctly how transfer learning works here.

I'm using a pretrained model whose backbone was trained on COCO; this backbone feeds the head with the bboxes of whatever the model detects.

And I replaced the head with a new one to classify those detections the backbone sends:

model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

In that case, the possible classes are (“dog”, “cat”); those are in my dataset as indexes 1 and 2. I have no background class in my dataset.

When I set num_classes to 3, the classes are (0, 1, 2), so the first one is the “background”. I was expecting the model to mark everything that is not a cat or a dog as background, but it seems everything fed to box_predictor is set to 1 or 2, so a cat or a dog.

I followed several tutorials like this one

They use a much smaller dataset than mine; I have 1330 images for training and 168 for testing.

Thanks for your time!!!

OK, so a confidence threshold is kind of hard to make work for similar objects.

everything that is not a dog/cat is matched to one of those two categories

Correct, that is expected, and hence the confidence filtering. Try passing a fridge or, say, an apple. Does it say it's a dog with great confidence? Probably not.

I was expecting the model to mark everything that is not a cat or a dog as background

Well, not exactly. You need to supply images with zero-box targets so that they are marked as background. The model won't just predict background for out-of-dataset objects. If you thought that's what would happen for the cow and you would get 0 as your class, then nope, that's not how it will work.
It's slightly complicated to explain why it won't.

Ahh ok!

So, to generally “discard” things that are not cats or dogs I must play with confidence (even if I need to set it > 0.95), and to discard something concrete I must also provide it to the dataset as a negative sample.


Thank you!!!

Yes, these are the two methods as far as I know. :smile: (Maybe there is something better, but I have used both of these and they work fine.)


Also, let me know if the above works :smile: and if it does, please mark it as the solution so that other people may benefit. (P.S. A nice name for the question might be “Predicting objects outside dataset using FasterRCNN”?)

Yep, you are right; surely others will have the same question. I will confirm, but I'm sure it will work.

I think I'm going to run the test set and put the scores in a list to estimate a score threshold… maybe that helps.
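That plan could be sketched as a small helper that runs the model over a loader and gathers every predicted score; `collect_scores` and the loader layout below are illustrative names of mine, assuming a torchvision-style detection model that returns dicts with a "scores" key:

```python
import torch

def collect_scores(model, data_loader, device="cpu"):
    """Gather every predicted score over a data loader (hypothetical helper)."""
    model.eval()
    all_scores = []
    with torch.no_grad():
        for images, _targets in data_loader:
            # torchvision detection models take a list of image tensors
            outputs = model([img.to(device) for img in images])
            for out in outputs:
                all_scores.append(out["scores"].cpu())
    return torch.cat(all_scores) if all_scores else torch.empty(0)

# A quantile of the collected scores can then suggest a threshold, e.g.:
# threshold = torch.quantile(collect_scores(model, test_loader), 0.05)
```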


I cannot set a negative sample box on an image (for Faster R-CNN training, same problem as the author has). I get the following error when I do:
ValueError: All bounding boxes should have positive height and width. Found invalid box [0.0, 0.0, 0.0, 0.0] for target at index 1.
Have you ever encountered the same problem, and what did you do to fix it?
Thank you in advance!
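For what it's worth, that error comes from torchvision's target validation: a [0, 0, 0, 0] box has zero height and width. Recent torchvision versions (0.6+) accept background-only images directly, so instead of a dummy box you can pass an empty target; a minimal sketch:

```python
import torch

# Target dict for a purely negative (background-only) image:
# boxes of shape (0, 4) and an empty labels tensor, instead of a
# dummy [0, 0, 0, 0] box that fails the positive-width/height check.
negative_target = {
    "boxes": torch.zeros((0, 4), dtype=torch.float32),
    "labels": torch.zeros((0,), dtype=torch.int64),
}
```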