My training code simply sums all of the losses returned by the pretrained Faster R-CNN model and calls backward on the result, like this:
loss_dict = model(images, targets)
loss_classifier = loss_dict['loss_classifier']
loss_box_reg = loss_dict['loss_box_reg']
loss_objectness = loss_dict['loss_objectness']
loss_rpn_box_reg = loss_dict['loss_rpn_box_reg']
total_loss = loss_classifier.mean() + loss_box_reg.mean() + loss_objectness.mean() + loss_rpn_box_reg.mean()
total_loss.backward()
I’m only updating the final layer of the model during training, with everything else frozen. Is that the correct approach, or do I need to do some bounding-box regression of my own, or use some other method?
Training on my own objects, with 1220 annotated images, just isn’t great: it’s barely 50% correct, and that’s on easy images.