[Caffe2] Operator(s) for batch-related tensor rearrange

erickim555 · April 9, 2019, 11:11pm

Hello, I am trying to perform the following transformation, ideally using existing Caffe2 operators:

Inputs:
  boxes.shape=[nb_boxes, 6]
    where each row boxes[i, :] -> [x1, y1, x2, y2, float confidence, int category_index].
    Notably, the rows of boxes are ordered by image: all boxes for image 0 come first, then all boxes for image 1, etc.
  batch_splits.shape=[nb_images]
    Denotes nb of boxes for each image in the batch.
    Example: if batch_splits = [42, 3], then the first 42 boxes in `boxes` belong to the first image, and the next 3 boxes belong to the second image.
Output:
  boxes_by_batch.shape=[nb_images, nb_boxes_max, 6]
    A rearrangement of boxes into the familiar batch-style format, e.g. boxes_by_batch[0, :, :] contains boxes for image 0, boxes_by_batch[1, :, :] contains boxes for image 1, etc.
    This is zero-filled: important, because each image may have different nb of detected boxes. The second dimension nb_boxes_max is the largest nb of boxes in an image in the batch.
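
For reference, the intended semantics can be sketched in plain NumPy. This is only an illustration of the desired output, not a Caffe2 solution: the function name `boxes_to_batch` is hypothetical, and the sketch assumes batch_splits holds whole-number counts that sum to nb_boxes.

```python
import numpy as np


def boxes_to_batch(boxes, batch_splits):
    """Rearrange [nb_boxes, 6] boxes into a zero-filled
    [nb_images, nb_boxes_max, 6] array (hypothetical helper)."""
    # batch_splits may arrive as float32, so cast the counts to integers.
    counts = np.asarray(batch_splits).astype(np.int64)
    nb_images = counts.shape[0]
    nb_boxes_max = int(counts.max()) if nb_images > 0 else 0

    # Start from zeros so images with fewer boxes are zero-padded.
    out = np.zeros((nb_images, nb_boxes_max, boxes.shape[1]), dtype=boxes.dtype)

    start = 0
    for i, n in enumerate(counts):
        # Copy the n consecutive rows that belong to image i.
        out[i, :n] = boxes[start:start + n]
        start += n
    return out
```

The explicit loop over images mirrors the per-image copy that any operator-based version would also need to perform.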

Here is a specific example, with code, that better illustrates what I’m trying to do:

import numpy as np
from caffe2.python import workspace

# [1/3] Initialize input data
boxes = np.array([
    [10, 20, 100, 200, 0.75, 1],
    [20, 30, 50, 70, 0.95, 3],
    [80, 100, 25, 25, 0.5, 2]
], dtype=np.float32)
# batch_splits indicates that first two boxes belong to img0, and third box
# belongs to img1
batch_splits = np.array([2, 1], dtype=np.float32)
workspace.FeedBlob("boxes", boxes)
workspace.FeedBlob("batch_splits", batch_splits)

# [2/3] Add op(s) that emit boxes_by_batch
ops = [] # TODO: fill me!
workspace.RunOperatorsOnce(ops)

# [3/3] Fetch boxes_by_batch, compare to desired output
boxes_by_batch = workspace.FetchBlob("boxes_by_batch")
boxes_by_batch_desired = np.array([
    [
        [10, 20, 100, 200, 0.75, 1],
        [20, 30, 50, 70, 0.95, 3]
    ],
    [
        [80, 100, 25, 25, 0.5, 2],
        [0, 0, 0, 0, 0, 0]
    ]
])
print("same? {}".format(np.allclose(boxes_by_batch, boxes_by_batch_desired)))

Context: I’m trying to serve a Detectron object detection model (FRCNN+FPN) with batch inference enabled. The Detectron model emits detected boxes in the (boxes, batch_splits) format, but for my use case it’d be easier if the network instead emitted the detected boxes in the boxes_by_batch format.
Ideally, I’d like to use existing Caffe2 operators so that I don’t have to write a custom C++ operator (and deal with linking it to our prod env, etc).

Thank you!

erickim555 · April 11, 2019, 11:27pm

Update: I was able to solve this using built-in Caffe2 operators. The While and If operators, which I wasn’t aware of, were extremely useful. Here is a great tutorial that helped me understand how to use them: https://github.com/caffe2/tutorials/blob/master/Control_Ops.ipynb