How to get mask from proposals and classification scores fast

Absurd · June 4, 2020, 6:03am

Suppose the batch size is B and each image has N rectangular proposals, so there exists a tensor proposals of shape = [B, N, 4] in xyxy format. Each proposal has scores for C classes, given by a tensor scores of shape = [B, N, C]. I want to get a mask of shape = [B, C, H, W] from proposals and scores. Each pixel value of the mask is the accumulated score of all proposals that cover that position:
[Image: formula for the mask value at each pixel; screenshot not preserved]
Here s is the mask, D is a proposal, and k denotes class k.
The only way I have figured out is to use a loop, as below:

import torch

# Random class scores in {0.0, 0.1, ..., 0.9}, shape [B, N, C]
scores = torch.randint(10, [2, 2, 20]) / 10.0
batch_size, proposal_number, num_class = scores.shape
height, width = 50, 50

# Proposals in xyxy format, shape [B, N, 4]
proposals = torch.tensor([[[0, 0, 20, 20], [10, 12, 28, 40]], [[5, 9, 20, 40], [6, 20, 41, 38]]])

masks = []
for i in range(batch_size):
    mask = torch.zeros(num_class, height, width)
    for j in range(proposal_number):
        x1, y1, x2, y2 = proposals[i, j].tolist()
        # Add this proposal's C scores to every pixel it covers
        mask[:, y1:y2, x1:x2] += scores[i, j].view(-1, 1, 1)
    masks.append(mask)
masks = torch.stack(masks, dim=0)  # [B, C, H, W]

But when N is large, this method is very slow. Is there a faster way to do this?
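
For what it's worth, the double loop can in principle be expressed with broadcasting: a box is the outer product of a row-coverage vector and a column-coverage vector, so the whole accumulation collapses into one torch.einsum call. The following is only a sketch under some assumptions: the helper name proposals_to_mask is made up for illustration, coordinates are non-negative integers, and boxes are half-open ([x1, x2) by [y1, y2)), matching the slicing in the loop. Whether it is actually faster depends on N, H, W and the device, so it would need to be benchmarked.

```python
import torch


def proposals_to_mask(proposals, scores, height, width):
    """Hypothetical helper: accumulate per-class scores over box areas.

    proposals: [B, N, 4] integer tensor in xyxy format (assumed non-negative)
    scores:    [B, N, C] floating-point tensor
    returns:   [B, C, H, W] tensor
    """
    x1, y1, x2, y2 = proposals.unbind(dim=-1)  # each of shape [B, N]
    ys = torch.arange(height, device=proposals.device)  # [H]
    xs = torch.arange(width, device=proposals.device)   # [W]

    # Which rows / columns each proposal covers (half-open, like y1:y2 and x1:x2)
    in_y = (ys >= y1[..., None]) & (ys < y2[..., None])  # [B, N, H]
    in_x = (xs >= x1[..., None]) & (xs < x2[..., None])  # [B, N, W]
    in_y = in_y.to(scores.dtype)
    in_x = in_x.to(scores.dtype)

    # mask[b, c, h, w] = sum_n scores[b, n, c] * in_y[b, n, h] * in_x[b, n, w]
    return torch.einsum("bnc,bnh,bnw->bchw", scores, in_y, in_x)


# Same toy inputs as in the loop version
scores = torch.randint(10, [2, 2, 20]) / 10.0
proposals = torch.tensor([[[0, 0, 20, 20], [10, 12, 28, 40]],
                          [[5, 9, 20, 40], [6, 20, 41, 38]]])
masks = proposals_to_mask(proposals, scores, 50, 50)
print(masks.shape)  # torch.Size([2, 20, 50, 50])
```

The only tensors built explicitly here are in_y of shape [B, N, H] and in_x of shape [B, N, W]; how einsum orders the contraction internally is up to PyTorch. Comparing the result against the loop version with torch.allclose is a cheap sanity check before timing anything.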