Training Faster RCNN with small rectangular images

I am trying to train Faster RCNN with a MobileNet V2 backbone in PyTorch. I am running into some issues, and I think they are due to my input images, which range from 40x40 px to 50x600 px. I am trying to predict just one class, and I don’t need very tight bounding boxes; each box just has to include the entire object. I know that ROI pooling in Faster RCNN should help with my varied input sizes, but maybe there are some hyperparameters that I should adjust.
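For reference, if the model is torchvision's FasterRCNN with its default min_size=800 and max_size=1333 (an assumption, the setup above does not say), images are rescaled before they reach the backbone. A rough sketch of that arithmetic follows; resized_size is a hypothetical helper written for illustration, and the exact rounding inside torchvision may differ by a pixel.

```python
def resized_size(height, width, min_size=800, max_size=1333):
    """Approximate the (height, width) an image would have after resizing.

    Assumes the rule: scale the shorter side up to min_size, unless that
    would push the longer side past max_size, in which case the longer
    side is scaled to max_size instead.
    """
    scale = min(min_size / min(height, width), max_size / max(height, width))
    return round(height * scale), round(width * scale)


# The two extremes mentioned above; which side of "50x600" is the height
# is an assumption here, but the scale factor is the same either way.
for height, width in [(40, 40), (50, 600)]:
    print((height, width), "->", resized_size(height, width))
```

Under those assumed defaults the 40x40 images would be upscaled by a factor of 20, while the 50x600 images would be scaled by only about 2.2, so objects would reach the network at very different effective scales.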
I have already adjusted the anchor boxes to this:
from torchvision.models.detection.rpn import AnchorGenerator
anchor_generator = AnchorGenerator(sizes=((16, 32, 64),), aspect_ratios=((0.4, 0.5, 1.0),))
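To make these anchor settings concrete, here is a minimal sketch of the shapes they would produce, assuming AnchorGenerator is the torchvision one and follows its convention that an aspect ratio means height / width, with each anchor's area roughly equal to size squared. The helper anchor_shapes is hypothetical and only mirrors that arithmetic; it is not part of torchvision.

```python
import math


def anchor_shapes(sizes, aspect_ratios):
    """Return (width, height) for every ratio/size pair.

    Assumes ratio = height / width and area = size ** 2, so
    height = size * sqrt(ratio) and width = size / sqrt(ratio).
    """
    shapes = []
    for ratio in aspect_ratios:
        h_scale = math.sqrt(ratio)
        w_scale = 1.0 / h_scale
        for size in sizes:
            shapes.append((round(w_scale * size, 1), round(h_scale * size, 1)))
    return shapes


# Values taken from the anchor generator above.
for width, height in anchor_shapes((16, 32, 64), (0.4, 0.5, 1.0)):
    print(f"w={width:6.1f}  h={height:6.1f}  h/w={height / width:.2f}")
```

Under that assumption every anchor is at least as wide as it is tall, and the most elongated one is only about 2.5 times wider than tall, which may be worth comparing against the shapes of the objects in the 50x600 images.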

Let me know if you have any advice or input that would help my training!