How to get the features inside a predicted bounding box?

In Faster R-CNN, the RoIs are pooled to shape (batch, 256, 7, 7) and then pass through 2 shared layers that feed both the bbox and cls branches. This gives a feature of shape (batch, 1024), from which the heads output the bbox pred (batch, 4) and the cls pred (batch, 2).
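To make the shapes concrete, here is a minimal sketch of that head in PyTorch. It assumes the 2 shared layers are fully connected (which the flattened (batch, 1024) output suggests), and the names `shared_fcs`, `fc_cls`, `fc_reg` and the RoI count of 8 are only illustrative, not taken from any particular library:

```python
import torch
from torch import nn

num_rois = 8  # hypothetical number of RoIs ("batch" in the shapes above)
roi_feats = torch.randn(num_rois, 256, 7, 7)  # pooled RoI features

# Two shared layers, assumed here to be fully connected.
shared_fcs = nn.Sequential(
    nn.Flatten(),                  # (batch, 256 * 7 * 7)
    nn.Linear(256 * 7 * 7, 1024),
    nn.ReLU(inplace=True),
    nn.Linear(1024, 1024),
    nn.ReLU(inplace=True),
)
fc_cls = nn.Linear(1024, 2)  # cls pred: (batch, 2)
fc_reg = nn.Linear(1024, 4)  # bbox pred: (batch, 4)

with torch.no_grad():
    x = shared_fcs(roi_feats)
    cls_pred = fc_cls(x)
    bbox_pred = fc_reg(x)

print(x.shape, cls_pred.shape, bbox_pred.shape)
```

Note that after the flatten step the spatial layout of the 7x7 grid is gone, so the (batch, 1024) vector cannot be mapped back to positions inside a box.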

Is there a way to transform the features from (batch, 256, 7, 7) into something like (batch, w_pred, h_pred), so that I can get the feature values inside each predicted bounding box?

Here w_pred and h_pred are the width and height of the predicted bounding boxes.

Or is there an easier way to get the feature values from the box predictions?
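One direction I could imagine, written as a rough sketch and not a confirmed solution: the (batch, 256, 7, 7) tensor covers the proposal, not the refined box, so instead of transforming it, crop the backbone feature map with the predicted box. The helper `crop_box_features`, the stride of 16, and the feature map size are all assumptions for illustration; the box is assumed to be (x1, y1, x2, y2) in image pixels.

```python
import math

import torch


def crop_box_features(feature_map, box_xyxy, stride):
    """Slice the feature-map cells covered by one box.

    feature_map: tensor of shape (C, H, W) for a single image.
    box_xyxy: (x1, y1, x2, y2) in image pixels.
    stride: assumed downsampling factor between image and feature map.
    """
    _, height, width = feature_map.shape
    x1, y1, x2, y2 = (float(v) for v in box_xyxy)
    # Convert pixels to feature-map cells, clamp to the map,
    # and keep at least one cell in each direction.
    left = min(max(math.floor(x1 / stride), 0), width - 1)
    top = min(max(math.floor(y1 / stride), 0), height - 1)
    right = min(max(math.ceil(x2 / stride), left + 1), width)
    bottom = min(max(math.ceil(y2 / stride), top + 1), height)
    return feature_map[:, top:bottom, left:right]


feature_map = torch.randn(256, 50, 68)  # hypothetical stride-16 feature map
pred_box = torch.tensor([32.0, 48.0, 160.0, 112.0])  # 128 x 64 pixel box
crop = crop_box_features(feature_map, pred_box, stride=16)
print(crop.shape)  # torch.Size([256, 4, 8])
```

The crop has shape (256, h_pred / stride, w_pred / stride) and not (w_pred, h_pred), and it differs per box, so the crops cannot be stacked into one batch tensor without resizing. If a fixed size is acceptable, `torchvision.ops.roi_align` applied to the predicted boxes would give interpolated features of one shape for every box.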