Expand vs Repeat: Semantic Difference?

Rafael_R · November 1, 2019, 3:39pm

Hi,

How do the below two ways of extending a batch_size x feature_size tensor differ?

    visual_feature = visual_feat.expand([batch_size, batch_size, self.feature_size])
    text_feature = sentence_embed.repeat(1, 1, batch_size).view(batch_size, batch_size, self.feature_size)

albanD · November 1, 2019, 4:58pm

Hi,

expand() never allocates new memory: it returns a view of the original tensor, which is why the dimension being expanded must have size 1.
repeat() always allocates new memory, and the repeated dimension can be of any size.
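
To make the memory difference concrete, here is a minimal sketch (the tensor `x` and its shape are made up for illustration). It checks the storage pointer and the strides of both results:

```python
import torch

x = torch.arange(3.0).unsqueeze(0)    # shape (1, 3); illustrative input

expanded = x.expand(4, 3)             # view on x, no copy
repeated = x.repeat(4, 1)             # fresh tensor, data copied 4 times

# expand() shares storage with x and uses stride 0 along the expanded dim.
print(expanded.data_ptr() == x.data_ptr())
print(expanded.stride())

# repeat() owns its own memory.
print(repeated.data_ptr() == x.data_ptr())

# The values themselves are identical.
print(torch.equal(expanded, repeated))
```

Because an expanded tensor aliases the same elements several times, in-place writes to it are best avoided; call .clone() or .contiguous() first if a writable copy is needed.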

aurooj · February 18, 2020, 12:10am

Hi, regarding the same question, will it have any impact on results if we choose expand() over repeat()?

I am trying to learn pairwise relations within a list of vectors (say Nxd). I want this to be learnable, so I expand the Nxd tensor to NxNxd and concatenate it with itself so that every vector is concatenated with every other vector, resulting in an NxNx2d tensor. I could use either repeat() or expand() to obtain such a tensor.

Keeping memory limitations in mind, I used expand(), but when I train my model, the scores are the same in every row, where I was expecting them to be diverse.

Do you think it could be because expand() does not allocate new memory?

albanD · February 18, 2020, 4:54pm

Hi,

I didn’t fully understand your description, but it is easy to check: just replace it with repeat() and see whether you get the same behavior.

aurooj · February 18, 2020, 7:31pm

Hi, Thanks for your reply!

Let me be more clear in explaining what I am trying to do here:
For an Nxd tensor, I want to learn self-attention scores for each possible pair.
I construct an NxNx2d tensor by doing the following:

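    def tile_concat(self, in1, in2):
        # in1, in2: (b, n, d) tensors of identical shape
        assert (in1.shape == in2.shape)
        b, n, d = in1.shape
        # t1[b, i, j] = in1[b, i]  -> (b, n, n, d)
        t1 = in1.unsqueeze(2).repeat(1, 1, n, 1)
        # t2[b, i, j] = in2[b, j]  -> (b, n, n, d)
        t2 = in2.unsqueeze(1).repeat(1, n, 1, 1)
        # out[b, i, j] = [in1[b, i], in2[b, j]]  -> (b, n, n, 2d)
        out = torch.cat([t1, t2], -1)
        return out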

where in1 and in2 are two different projections of the same Nxd tensor.
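
One way to check whether expand() and repeat() could behave differently here is to compare both variants directly. The sketch below is an assumption-laden test harness, not the original model: the function names `tile_concat_repeat` and `tile_concat_expand`, the input shapes, and the random weighting `w` are all made up for illustration. It compares the forward outputs and the gradients that flow back to the inputs:

```python
import torch

def tile_concat_repeat(in1, in2):
    # repeat()-based pairing, as in the snippet above
    b, n, d = in1.shape
    t1 = in1.unsqueeze(2).repeat(1, 1, n, 1)
    t2 = in2.unsqueeze(1).repeat(1, n, 1, 1)
    return torch.cat([t1, t2], -1)

def tile_concat_expand(in1, in2):
    # expand()-based pairing (hypothetical variant for comparison)
    b, n, d = in1.shape
    t1 = in1.unsqueeze(2).expand(b, n, n, d)
    t2 = in2.unsqueeze(1).expand(b, n, n, d)
    return torch.cat([t1, t2], -1)

torch.manual_seed(0)
in1 = torch.randn(2, 4, 3, requires_grad=True)   # illustrative shapes
in2 = torch.randn(2, 4, 3, requires_grad=True)

out_r = tile_concat_repeat(in1, in2)
out_e = tile_concat_expand(in1, in2)
same_forward = torch.equal(out_r, out_e)

# Arbitrary weighting so the gradient is not trivially uniform.
w = torch.randn_like(out_r)
g_r = torch.autograd.grad((out_r * w).sum(), [in1, in2])
g_e = torch.autograd.grad((out_e * w).sum(), [in1, in2])
same_grad = all(torch.allclose(a, b) for a, b in zip(g_r, g_e))

print(same_forward, same_grad)
```

Autograd sums the gradient over the expanded dimension in both cases, so the two variants should be interchangeable for training; the only practical difference is that torch.cat has to materialize the result either way, while the expand() version skips the intermediate copies.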

I used a fully connected layer to map the NxNx2d tensor to NxNx1, followed by a row-wise softmax to get probability scores.

These scores are then used to obtain a weighted encoding of size Nxd.

I hope this gives a clearer picture.
Learning the NxN matrix gives me the same scores in each row. I have now replaced expand() with repeat() and am training my model again.

But I doubt that this would cause an issue like the one I am having. Can you point to any other problem that might cause such behavior?
Thanks in advance!
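
For reference, the pipeline described in this post can be sketched as follows. Several things here are assumptions rather than details from the post: the "fully connected layer" is taken to be a single nn.Linear with no nonlinearity before the softmax, "row-wise" is taken to mean softmax over the last dimension, the weighted encoding is taken to be a batched matrix product with the input, and the names `proj1`, `proj2` and `score_fc` are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
b, n, d = 2, 4, 3                        # illustrative sizes
x = torch.randn(b, n, d)

proj1 = nn.Linear(d, d)                  # hypothetical projection 1
proj2 = nn.Linear(d, d)                  # hypothetical projection 2
score_fc = nn.Linear(2 * d, 1)           # assumed: one linear layer, no activation

in1, in2 = proj1(x), proj2(x)
pairs = torch.cat(
    [in1.unsqueeze(2).expand(b, n, n, d),    # pairs[b, i, j, :d] = in1[b, i]
     in2.unsqueeze(1).expand(b, n, n, d)],   # pairs[b, i, j, d:] = in2[b, j]
    dim=-1,
)

scores = score_fc(pairs).squeeze(-1)     # (b, n, n)
attn = torch.softmax(scores, dim=-1)     # assumed: softmax over j for each row i
encoded = torch.bmm(attn, x)             # (b, n, d) weighted encoding

# Compare every row of the attention matrix with the first row.
rows_identical = torch.allclose(attn, attn[:, :1].expand_as(attn), atol=1e-5)
print(rows_identical)
```

Under these assumptions the score splits into a term that depends only on i plus a term that depends only on j. Softmax over j is invariant to adding a constant, so the i-dependent term cancels and every row comes out the same, whether the pairs were built with expand() or repeat(). If the actual model matches this sketch, putting a nonlinearity between the concatenation and the final score (for example a small MLP) would be one way to let the rows differ.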