# Is there an easy way to compute a "head-mask-specified" pooling?

For example, I have a head mask `h = [1, 1, 0, 0, 1]`, where `1` denotes the head of a span.
I also have a tensor `a = torch.randn(5, 100)`. I want to implement a function that computes a new tensor `b` of size (3, 100), such that `b[0] = a[0]`, `b[1] = a[1] + a[2] + a[3]` (as denoted in `h`), and `b[2] = a[4]`.
Doing this with a `for` loop is easy, but is there a more efficient way? What if `h` and `a` are batched?

Hi, ryancc

Hi, thank you very much.

Say I have a sentence consisting of two words, S = ["Definitely", "not"], and I want to turn S into an embedding matrix T of size (2, 100), where each row represents a word.

I want to use BERT embeddings. But in BERT, each word is split into sub-word units. This means that S will be represented as ["Def", "##in", "##ite", "##ly", "not"] ("Definitely" is tokenized as "Def", "##in", "##ite", "##ly"), so BERT will output an embedding matrix H of size (5, 100) :(.

My goal is to merge some rows of H according to the sub-word units.
For example, for "Definitely", I should merge the embeddings of ["Def", "##in", "##ite", "##ly"] to get its representation.

In my current method, I use a head mask vector h = [1, 0, 0, 0, 1] to record the "head" of each word, where 1 indicates a head position:
h = [
1, -> "Def"
0, -> "##in"
0, -> "##ite"
0, -> "##ly"
1 -> "not"
]
So I should merge each row whose head mask is 0 into the preceding row whose head mask is 1. Right now I have to use a `for` loop over every element of `h`, which is slow and cannot be batched.
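
For reference, the loop-based merge described above might look like the sketch below. It assumes that "merging" means summing the sub-word rows (as in the first post) and that the first entry of `h` is always 1; the function name and the tiny stand-in for `H` are made up for illustration.

```python
import torch

def merge_by_head_mask_loop(H, h):
    """Slow reference: sum each run of sub-word rows into the row of its head."""
    words = []
    for row, is_head in zip(H, h):
        if is_head == 1:
            words.append(row.clone())      # a head starts a new word
        else:
            words[-1] = words[-1] + row    # a continuation piece joins the current word
    return torch.stack(words)

# Stand-in for the BERT output: 5 sub-word rows, 2 features instead of 100.
H = torch.arange(10, dtype=torch.float32).reshape(5, 2)
h = [1, 0, 0, 0, 1]
T = merge_by_head_mask_loop(H, h)
print(T.shape)  # torch.Size([2, 2])
```

This runs one Python iteration per sub-word token, which is the part that becomes slow for long or batched inputs.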

Could you give me some efficient method to do the above computation?

I have found an efficient way to deal with it.

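```python
import torch

N = 6
C = 4

data = torch.randn(N, C)

# Assumption: the original post does not show how `splits` was built.
# Here an example head mask is turned into per-word lengths for torch.split.
head_mask = [1, 0, 0, 1, 0, 1]  # hypothetical mask; the first entry must be 1
bounds = [i for i, m in enumerate(head_mask) if m == 1] + [N]
sizes = [end - start for start, end in zip(bounds[:-1], bounds[1:])]  # [3, 2, 1]
splits = torch.split(data, sizes)

print('ori data')
print(data)
print('after splitting')
print(splits)

# Each item in `splits` holds the rows of one word; combine the rows within
# each item (e.g. by summing them) to get the word vectors you want.
```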
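
If the goal is the summed rows themselves rather than the split pieces, one possible loop-free alternative is to turn the head mask into a word index with a cumulative sum and accumulate rows with `index_add_` (or `scatter_add_` for a batch). This is only a sketch under a few assumptions: the function names are made up, every sequence starts with a head (`h[..., 0] == 1`), and in the batched case any padded positions have mask 0 and all-zero rows in `a`.

```python
import torch

def merge_by_head_mask(a, h):
    """Sum rows of `a` (N, C) into one row per head in `h` (N,), without a Python loop."""
    seg = torch.cumsum(h, dim=0) - 1   # word index of every row, e.g. [0, 1, 1, 1, 2]
    out = torch.zeros(int(h.sum()), a.size(1), dtype=a.dtype, device=a.device)
    return out.index_add_(0, seg, a)   # out[seg[i]] += a[i]

def merge_by_head_mask_batched(a, h):
    """Batched variant: `a` is (B, N, C), `h` is (B, N); the result is (B, W, C),
    where W is the largest word count in the batch (unused rows stay zero)."""
    seg = torch.cumsum(h, dim=1) - 1                      # (B, N)
    W = int(h.sum(dim=1).max())
    out = torch.zeros(a.size(0), W, a.size(2), dtype=a.dtype, device=a.device)
    index = seg.unsqueeze(-1).expand(-1, -1, a.size(2))   # (B, N, C)
    return out.scatter_add_(1, index, a)

h = torch.tensor([1, 1, 0, 0, 1])
a = torch.arange(10, dtype=torch.float32).reshape(5, 2)
b = merge_by_head_mask(a, h)           # b[1] == a[1] + a[2] + a[3]

h_batch = torch.tensor([[1, 1, 0, 0, 1], [1, 0, 0, 0, 1]])
a_batch = torch.stack([a, a])
b_batch = merge_by_head_mask_batched(a_batch, h_batch)
print(b.shape, b_batch.shape)
```

The second sequence in the batch has only two words, so its third output row is left as zeros; a mask derived from `h_batch.sum(dim=1)` can be used to ignore such rows downstream.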

Hi, Naruto-Sasuke:
I have read your solution. It is exactly what I am looking for!
Thank you very much!