In Faster R-CNN, the RoI features are reshaped to (batch, 256, 7, 7) and then pass through 2 shared convs used for both bbox and cls prediction. This results in a tensor of shape (batch, 1024), from which the model outputs the bbox pred (batch, 4) and the cls pred (batch, 2).
Is there a way to transform the features from (batch, 256, 7, 7) to something (batch, w_pred, h_pred), so that I can get the feature values inside the bounding box? w_pred and h_pred are the width and height of the predicted bounding boxes.
Or is there an easier way to get the feature values from the box predictions?
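
To make the question concrete, here is a minimal sketch of what "the feature values inside the bounding box" could mean. It is an illustration under assumptions, not how any particular Faster R-CNN implementation does it: the helper names box_to_feature_window and crop_features are hypothetical, the feature map is a plain nested list indexed [channel][row][col], and a single stride of 16 between image pixels and feature cells is assumed. The crop is taken from the backbone feature map rather than from the (batch, 256, 7, 7) tensor, because the 7x7 grid is already a fixed-size resampling of each proposal.

```python
import math


def box_to_feature_window(box, stride, feat_h, feat_w):
    """Map an image-space box (x1, y1, x2, y2) to feature-map cell indices.

    Returns (col0, row0, col1, row1) with exclusive upper bounds.
    `stride` is the assumed downsampling factor of the feature map.
    """
    x1, y1, x2, y2 = box
    # Clamp the start index so that it always points at a valid cell.
    col0 = min(max(0, math.floor(x1 / stride)), feat_w - 1)
    row0 = min(max(0, math.floor(y1 / stride)), feat_h - 1)
    # Keep at least one cell so that very small boxes still return a value.
    col1 = max(col0 + 1, min(feat_w, math.ceil(x2 / stride)))
    row1 = max(row0 + 1, min(feat_h, math.ceil(y2 / stride)))
    return col0, row0, col1, row1


def crop_features(feature_map, box, stride):
    """Return the feature values under `box`, indexed [channel][row][col]."""
    feat_h = len(feature_map[0])
    feat_w = len(feature_map[0][0])
    col0, row0, col1, row1 = box_to_feature_window(box, stride, feat_h, feat_w)
    return [
        [row[col0:col1] for row in channel[row0:row1]]
        for channel in feature_map
    ]


# Toy feature map: 2 channels, 4 rows, 6 columns (hypothetical values).
feature_map = [
    [[c * 100 + r * 10 + x for x in range(6)] for r in range(4)]
    for c in range(2)
]

# A predicted box in image coordinates, with an assumed stride of 16.
crop = crop_features(feature_map, (16, 0, 64, 32), stride=16)
print((len(crop), len(crop[0]), len(crop[0][0])))  # prints (2, 2, 3)
```

Each crop has its own height and width, so crops for different boxes cannot be stacked into one (batch, w_pred, h_pred) tensor without padding or resampling. If a fixed size is acceptable, torchvision.ops.roi_align can resample the feature map inside arbitrary boxes, including predicted ones.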