Suppose the batch size is B and each image has N rectangular proposals, so there is a tensor `proposals` of `shape = [B, N, 4]` in xyxy format. Each proposal has scores for C classes, giving a tensor `scores` of `shape = [B, N, C]`. I want to build a mask of `shape = [B, C, H, W]` from `proposals` and `scores`, where each pixel value of the mask is the accumulated score of all proposals that cover that position:

$$s_k(x, y) = \sum_{n \,:\, (x, y) \in D_n} \text{score}_{n,k}$$

Here s is the mask, D_n is the n-th proposal, and k is the class index.

The only way I have figured out is to use loops, as below:

```
import torch

scores = torch.randint(10, [2, 2, 20]) / 10.0  # [B, N, C]
batch_size, proposal_number, num_class = scores.shape
height, width = 50, 50
proposals = torch.LongTensor([[[0, 0, 20, 20], [10, 12, 28, 40]], [[5, 9, 20, 40], [6, 20, 41, 38]]])  # [B, N, 4]
masks = []
for i in range(batch_size):
    mask = torch.zeros(num_class, height, width)
    for j in range(proposal_number):
        proposal = proposals[i, j].to(torch.int)
        # Add this proposal's class scores to every pixel it covers (rows y1:y2, cols x1:x2).
        mask[:, proposal[1]: proposal[3], proposal[0]: proposal[2]] += scores[i, j, :].view(-1, 1, 1)
    masks.append(mask)
masks = torch.stack(masks, dim=0)  # [B, C, H, W]
```

But when N is large, this method is very slow. Is there a faster way to do this?
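One possible direction, offered as a sketch and not a benchmarked solution: a box covers pixel (x, y) exactly when y falls in its row range and x falls in its column range, so the coverage can be split into a per-row and a per-column indicator and contracted with the scores in a single `torch.einsum` call. The function name `proposals_to_masks` is hypothetical. The sketch assumes integer xyxy boxes with non-negative coordinates and half-open ranges, matching the slicing in the loop above.

```python
import torch


def proposals_to_masks(proposals, scores, height, width):
    """Hypothetical helper: proposals [B, N, 4] (xyxy), scores [B, N, C] -> [B, C, H, W]."""
    device = proposals.device
    ys = torch.arange(height, device=device).view(1, 1, height)  # [1, 1, H]
    xs = torch.arange(width, device=device).view(1, 1, width)    # [1, 1, W]

    # Split the box coordinates into four tensors of shape [B, N, 1].
    x1, y1, x2, y2 = proposals.unsqueeze(-1).unbind(dim=2)

    # Indicator of which rows / columns each proposal covers (half-open ranges).
    in_rows = ((ys >= y1) & (ys < y2)).to(scores.dtype)  # [B, N, H]
    in_cols = ((xs >= x1) & (xs < x2)).to(scores.dtype)  # [B, N, W]

    # Sum over proposals n of score[b, n, c] * in_rows[b, n, h] * in_cols[b, n, w].
    return torch.einsum("bnc,bnh,bnw->bchw", scores, in_rows, in_cols)


scores = torch.randint(10, [2, 2, 20]) / 10.0
proposals = torch.tensor([[[0, 0, 20, 20], [10, 12, 28, 40]],
                          [[5, 9, 20, 40], [6, 20, 41, 38]]])
masks = proposals_to_masks(proposals, scores, 50, 50)  # [2, 20, 50, 50]
```

This removes the Python loops, but the contraction may create intermediates that grow with N, so whether it is actually faster for a given B, N, C, H and W would need to be measured.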