I have been working on rotated object detection with Faster R-CNN on aerial imagery for some time and have encountered two different approaches for producing rotated bounding boxes. The first approach modifies the RPN of Faster R-CNN to produce inclined (rotated) proposals and then applies rotated bounding box regression to refine the final boxes, as explained here. The second approach keeps the RPN generating axis-aligned proposals and adds an additional regression branch to the classification head of Faster R-CNN to produce the final rotated boxes, as explained here.
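To make my understanding concrete, here is a minimal sketch of how I picture the two approaches. It is my own simplification and is not taken from either reference: the function names (`rotated_anchors`, `horizontal_anchors`, `decode`), the `(cx, cy, w, h, theta)` box layout with angles in radians, and the delta parametrization are all assumptions for illustration.

```python
import math
from itertools import product

def rotated_anchors(cx, cy, scales, ratios, angles):
    """Approach 1 (assumed): RPN anchors carry an angle, (cx, cy, w, h, theta)."""
    anchors = []
    for s, r, a in product(scales, ratios, angles):
        w = s * math.sqrt(r)   # width grows with the aspect ratio
        h = s / math.sqrt(r)   # height shrinks, so the area stays s * s
        anchors.append((cx, cy, w, h, a))
    return anchors

def horizontal_anchors(cx, cy, scales, ratios):
    """Approach 2 (assumed): standard axis-aligned RPN anchors, theta fixed at 0."""
    return rotated_anchors(cx, cy, scales, ratios, angles=[0.0])

def decode(box, deltas):
    """Apply (dx, dy, dw, dh, dtheta) offsets to a (cx, cy, w, h, theta) box.

    Simplified parametrization: centre offsets are scaled by the box size,
    sizes are scaled exponentially, and the angle offset is added directly.
    """
    cx, cy, w, h, theta = box
    dx, dy, dw, dh, dt = deltas
    return (cx + dx * w, cy + dy * h,
            w * math.exp(dw), h * math.exp(dh),
            theta + dt)

scales, ratios = [32, 64], [0.5, 1.0, 2.0]
angles = [-math.pi / 3, 0.0, math.pi / 3]

# Approach 1: the angle is already present at the proposal stage.
a1 = rotated_anchors(0, 0, scales, ratios, angles)

# Approach 2: proposals are horizontal; the angle only appears when the
# detection head regresses dtheta from a theta = 0 proposal.
a2 = horizontal_anchors(0, 0, scales, ratios)
final_box = decode(a2[0], (0.0, 0.0, 0.0, 0.0, 0.5))
```

In this sketch the first approach multiplies the number of anchors per location by the number of angles, while the second keeps the usual anchor count and leaves all of the angle prediction to the second-stage head.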
I am a little confused about these approaches. My questions are:
- What exactly is the difference between these methods?
- Does choosing one approach over the other significantly affect detection results on aerial imagery?