I trained a Faster R-CNN model on 400 annotated hip X-rays. I want the model to detect the areas around the hip joints. After training for more than 15 epochs, I am getting very good losses and IoU on both the training and validation datasets. But when I use Grad-CAM to see how the model predicts the bounding boxes, I can see that the model relies on background areas of the X-rays to make its predictions, for example the black background at the side of the hip. Why does this happen, and what can I do so that the model bases its predictions on the hip joint areas?
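To put a number on what I am seeing, here is a minimal sketch (not my actual pipeline) that measures what fraction of the Grad-CAM activation falls inside a predicted box. It assumes the heatmap is a 2D array at image resolution and that boxes are in `(x1, y1, x2, y2)` pixel format; the function name `cam_fraction_inside_box` is made up for illustration.

```python
import numpy as np


def cam_fraction_inside_box(cam, box):
    """Return the fraction of total heatmap activation inside the box.

    cam: 2D array (H, W), assumed to be a Grad-CAM heatmap at image resolution.
    box: (x1, y1, x2, y2) in pixel coordinates (assumed format).
    """
    # Negative values carry no "attention", so clip them to zero.
    cam = np.clip(np.asarray(cam, dtype=np.float64), 0.0, None)
    total = cam.sum()
    if total == 0.0:
        return 0.0

    # Round the box to integer pixels and clamp it to the image bounds.
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    h, w = cam.shape
    x1, x2 = max(0, x1), min(w, x2)
    y1, y2 = max(0, y1), min(h, y2)
    if x2 <= x1 or y2 <= y1:
        return 0.0

    return float(cam[y1:y2, x1:x2].sum() / total)


if __name__ == "__main__":
    # Toy heatmap: one hot spot outside the box, a stronger one inside it.
    toy_cam = np.zeros((4, 4))
    toy_cam[0, 0] = 1.0
    toy_cam[2, 2] = 3.0
    print(cam_fraction_inside_box(toy_cam, (2, 2, 4, 4)))
```

A value close to 1 would mean the heatmap is concentrated on the boxed region, while a low value would match what I observe, with most of the activation lying on the background.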
I would appreciate any help!