Faster R-CNN box coder

sigma_x · November 22, 2022, 2:30pm

At evaluation time FRCNN decodes boxes in the following way:

Get N candidate boxes from the RPN, dim Nx4
Pool features for them with RoIAlign to get N RoI feature maps, dim NxKxHxW
Predict C+1 class scores and box deltas from each RoI, dim Nx(C+1) and Nx(C+1)x4
Decode each of the (C+1) box predictions from each RoI against that RoI's source candidate
Thus obtain decoded boxes, dim Nx(C+1)x4

What I don’t fully understand about the BoxCoder class (see _utils.py in the detection module) is whether it works when I want to decode one set of box deltas with respect to multiple references. Say there are M deltas and R references, and I want R decoded boxes for each delta, i.e. the output has dim MxRx4.
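
To make the shapes concrete, here is a minimal sketch of the behaviour I am after. It is not torchvision's BoxCoder: `decode_boxes` is a hypothetical helper of my own, and it assumes the usual Faster R-CNN parameterization (deltas are `(dx, dy, dw, dh)`, boxes are `(x1, y1, x2, y2)`), with weights `(10, 10, 5, 5)` and a clip of `log(1000/16)` picked as assumed defaults. It relies purely on broadcasting over the leading dimensions.

```python
import math
import torch


def decode_boxes(deltas, refs, weights=(10.0, 10.0, 5.0, 5.0),
                 clip=math.log(1000.0 / 16)):
    """Hypothetical helper: decode (dx, dy, dw, dh) against (x1, y1, x2, y2).

    `deltas` and `refs` both end in a dim of size 4; their leading dims
    only need to be broadcast-compatible.
    """
    wx, wy, ww, wh = weights  # assumed scaling of the regression targets

    # Reference boxes as centre / size.
    ref_w = refs[..., 2] - refs[..., 0]
    ref_h = refs[..., 3] - refs[..., 1]
    ref_cx = refs[..., 0] + 0.5 * ref_w
    ref_cy = refs[..., 1] + 0.5 * ref_h

    dx = deltas[..., 0] / wx
    dy = deltas[..., 1] / wy
    # Clamp the log-size deltas so exp() cannot overflow.
    dw = torch.clamp(deltas[..., 2] / ww, max=clip)
    dh = torch.clamp(deltas[..., 3] / wh, max=clip)

    # Shift the centre and rescale the size; broadcasting pairs
    # every delta with every reference.
    cx = dx * ref_w + ref_cx
    cy = dy * ref_h + ref_cy
    w = torch.exp(dw) * ref_w
    h = torch.exp(dh) * ref_h

    return torch.stack(
        (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h), dim=-1
    )


M, R = 3, 5
deltas = torch.randn(M, 4)
top_left = torch.rand(R, 2) * 100
sizes = torch.rand(R, 2) * 50 + 1
refs = torch.cat((top_left, top_left + sizes), dim=1)  # valid x1y1x2y2 boxes

# (M, 1, 4) against (1, R, 4) broadcasts to (M, R, 4).
decoded = decode_boxes(deltas[:, None, :], refs[None, :, :])
print(decoded.shape)
```

The per-class case from the list above would be the same call with deltas of shape Nx(C+1)x4 and references passed as `refs[:, None, :]`, so each RoI's deltas are decoded only against its own candidate.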

Is this possible?