Forward function for object detection or instance segmentation

The forward function expects batched input, so we do not add a for loop over samples inside it. However, in object detection or instance segmentation, each sample in the batch may contain several instances, and the number of instances can differ from sample to sample. In that case, should I add a for loop over the instances inside my forward function and then concatenate the results?

For simplicity, consider object detection. After the region proposal stage, the ROIs are extracted, so for each image in the batch I have one ROI per proposal. The forward function takes the batch of images, so should I loop over each ROI? How is this usually handled? I tried reading the Faster R-CNN code, but I got a bit confused.
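To make the question concrete, here is a minimal sketch of the two options I am weighing. It assumes the ROI features have already been pooled to a fixed size, so every ROI has the same shape even though the number of ROIs differs per image. The `head` module, the tensor sizes, and the variable names are all made up for illustration and are not taken from the Faster R-CNN code.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical per-ROI head: flattens pooled features and predicts 5 class scores.
head = nn.Sequential(nn.Flatten(), nn.Linear(8 * 7 * 7, 5))
head.eval()

# Assumed input: one tensor of pooled ROI features per image.
# Image 0 has 3 proposals and image 1 has 2; each tensor is (num_rois, C=8, 7, 7).
roi_feats = [torch.randn(3, 8, 7, 7), torch.randn(2, 8, 7, 7)]

# Option A: loop over the images and run the head on each group of ROIs.
with torch.no_grad():
    looped = [head(f) for f in roi_feats]

# Option B: concatenate all ROIs along dim 0 so they act as one big batch,
# run a single forward pass, then split the result back per image.
counts = [f.shape[0] for f in roi_feats]
with torch.no_grad():
    flat = head(torch.cat(roi_feats, dim=0))
batched = list(torch.split(flat, counts, dim=0))
```

In this sketch both options give the same per-image outputs, and option B needs no Python loop inside the forward pass. What I am unsure about is whether this flattening of ROIs into the batch dimension is what detection models usually do.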