What's the best way to concatenate these PyTorch dimensions?

I have a tensor P of shape (batch-size x num-layers x length x embedding-size).
I want to concatenate the embeddings across all layers, so that I end up with a tensor of shape:
(batch-size x length x num-layers*embedding-size)

Let’s take an example:

P = torch.randn(10, 3, 105, 1024)
where batch-size = 10, num-layers = 3, length-of-sentence = 105, embedding-size = 1024.

I want to concatenate the embeddings from the 3 layers at each time step in the sentence.

One way I can do this is:

batch_size = 10
concats = []
for idx in range(batch_size):
    # concatenate the 3 layer embeddings along the feature dimension, then add back a batch axis
    concats.append(torch.cat([P[idx][0], P[idx][1], P[idx][2]], dim=1)[None, :, :])
Q = torch.cat(concats, dim=0)

Q’s dimensions: (10, 105, 3072)

Note that R = P.view(10, 105, -1) also gives a tensor with the same shape as Q, but it is a different tensor: the view just reshapes the underlying memory, so each row of R ends up holding three consecutive time steps of a single layer (e.g. the first layer at time steps 1, 2, 3) rather than the 3 layers at a single time step.
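To see the difference concretely, a quick check along these lines (reusing the P and Q defined above) should show that the shapes match but the contents don't:

R = P.view(10, 105, -1)
print(R.shape)                                        # torch.Size([10, 105, 3072]), same shape as Q
print(torch.equal(Q, R))                              # False: the elements are arranged differently
# R's first row holds layer 0 at time steps 0, 1, 2, not the 3 layers at time step 0
print(torch.equal(R[0, 0], P[0, 0, :3].reshape(-1)))  # True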

Is there any faster memory-efficient way of getting the Q tensor?

You can use .permute to swap the layer and length axes, then apply .contiguous().view to merge the last two dimensions.

>>> d = torch.randn(10, 3, 105, 1024)
>>> d.shape
torch.Size([10, 3, 105, 1024])
>>> d = d.permute(0, 2, 1, 3)
>>> d.shape
torch.Size([10, 105, 3, 1024])
>>> d = d.contiguous().view(10, 105, -1)
>>> d.shape
torch.Size([10, 105, 3072])
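
Applying the same permute and view to the P from the question should reproduce the loop-built Q exactly (Q2 below is just a name for that reshaped copy):

>>> Q2 = P.permute(0, 2, 1, 3).contiguous().view(10, 105, -1)
>>> torch.equal(Q, Q2)
True

On recent PyTorch versions you can also write this as P.permute(0, 2, 1, 3).reshape(10, 105, -1), since .reshape takes care of the copy that .view needs .contiguous() for.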