Creating bigram embeddings by concatenating adjacent word embeddings

Hi,
I am trying to create bigram vectors by concatenating adjacent word vectors along the sequence dimension. I have gotten that to work, but my current implementation (see the toy example below) mixes PyTorch tensor operations with Python list comprehensions, which is likely suboptimal and inefficient. Is there a better “pure PyTorch” implementation of this, or am I just optimizing prematurely? :slight_smile:
Thanks!

In [22]: t
Out[22]:
tensor([[1.1000, 1.2000, 1.3000],
        [2.1000, 2.2000, 2.3000],
        [3.1000, 3.2000, 3.3000]])

In [23]: torch.stack(list(torch.cat((a, b)) for a, b in zip(t, t[1:])))
Out[23]:
tensor([[1.1000, 1.2000, 1.3000, 2.1000, 2.2000, 2.3000],
        [2.1000, 2.2000, 2.3000, 3.1000, 3.2000, 3.3000]])

Every optimization is premature unless it’s the bottleneck :wink:

Depending on your use case, unfold might be faster, but it would also create a copy:

import torch

t = torch.tensor([[1.1000, 1.2000, 1.3000],
                  [2.1000, 2.2000, 2.3000],
                  [3.1000, 3.2000, 3.3000]])

# reference: concatenate each pair of adjacent rows via a Python loop
out = torch.stack(list(torch.cat((a, b)) for a, b in zip(t, t[1:])))

# pure tensor ops: slide a window of `size` rows with the given stride,
# then reshape the windows into concatenated bigram vectors
size = 2
stride = 1
ret = t.unfold(0, size, stride).permute(2, 0, 1).contiguous().view(-1, t.size(1) * size)

print((out == ret).all())
# tensor(True)
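
For the bigram case specifically (window size 2), plain slicing plus torch.cat might be the simplest pure PyTorch option; a minimal sketch, assuming t is a 2D (seq_len, embed_dim) tensor as in the toy example:

import torch

t = torch.tensor([[1.1000, 1.2000, 1.3000],
                  [2.1000, 2.2000, 2.3000],
                  [3.1000, 3.2000, 3.3000]])

# pair each row with the following row and concatenate along the embedding dim
bigrams = torch.cat((t[:-1], t[1:]), dim=1)

print(bigrams)
# tensor([[1.1000, 1.2000, 1.3000, 2.1000, 2.2000, 2.3000],
#         [2.1000, 2.2000, 2.3000, 3.1000, 3.2000, 3.3000]])

The slices t[:-1] and t[1:] are views, but torch.cat still allocates a new output tensor, so this also makes a copy.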