torch.repeat and torch.expand: which to use?

Hanxiong_Chen · October 24, 2018, 11:37pm

I have one vector and one 2-D tensor. Say embed1 = [1, 2, 3] with shape (3) and embed2 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] with shape (3, 3). I want to expand embed1 to [[1, 2, 3], [1, 2, 3], [1, 2, 3]] so that I can use torch.cat() to concatenate embed1 with embed2 and send the result to an MLP. This is to avoid a for loop that concatenates embed1 with each vector in embed2 and sends each concatenated vector to the MLP separately. The documentation says that torch.repeat will copy data, so I don’t know whether this affects backpropagation, since the data are copies rather than the original. I want autograd to update only my original embed1.

albanD · October 25, 2018, 9:10am

Hi,

Backprop-wise, they will give the exact same result.
The difference is that if the original dimension you want to expand is of size 1, you can use torch.expand() to do it without using extra memory.
If the dimension you want to expand is of size more than 1, then you actually want to repeat what is at that dimension and you should use torch.repeat(). It will use extra memory (there is no way around that).
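
A minimal sketch of the case from the question, assuming a fixed `weights` tensor as a hypothetical stand-in for the MLP. It checks that the expanded and repeated versions hold the same values, that the expanded one has stride 0 along the new dimension (so no rows are copied), and that the gradient reaching `embed1` is the same either way:

```python
import torch

embed1 = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)   # shape (3,)
embed2 = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0],
                       [7.0, 8.0, 9.0]])                      # shape (3, 3)

# Add a leading dimension of size 1, then expand it: a view, no copy.
expanded = embed1.unsqueeze(0).expand(3, -1)                  # shape (3, 3)
# repeat() produces the same values but materializes every row.
repeated = embed1.repeat(3, 1)                                # shape (3, 3)

same_values = torch.equal(expanded, repeated)
expanded_stride = expanded.stride()   # stride 0 along the expanded dimension
repeated_stride = repeated.stride()   # ordinary row-major strides

# Hypothetical stand-in for the MLP: a fixed elementwise weighting.
weights = torch.arange(18.0).view(3, 6)

loss_expand = (torch.cat([expanded, embed2], dim=1) * weights).sum()
loss_repeat = (torch.cat([repeated, embed2], dim=1) * weights).sum()

# Gradients with respect to the original embed1 in both cases.
grad_expand, = torch.autograd.grad(loss_expand, embed1)
grad_repeat, = torch.autograd.grad(loss_repeat, embed1)

print(same_values, expanded_stride, repeated_stride)
print(grad_expand, grad_repeat)
```

In both cases the gradient is accumulated over the three rows back into the single original `embed1`.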

tomahawk810 · October 26, 2018, 11:51am

Therefore, when expanding a dimension of size 1, should torch.expand() always be used instead of torch.repeat()?

albanD · October 26, 2018, 12:20pm

Yes.
Keep in mind, though, that if you plan to modify this expanded tensor in-place, you need to call .clone() on it first so that it is actually a full tensor (with memory for each element). But even .expand().clone() should be faster than .repeat(), I think.
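
A small sketch of why the clone matters, using made-up tensors `row`, `view` and `full`. All rows of the expanded view alias the same memory, so in-place operations applied directly to it either raise an error or write through to the shared data, depending on the operation. After `.clone()`, each row has its own memory:

```python
import torch

row = torch.zeros(1, 3)
view = row.expand(4, 3)      # 4 rows that all alias the same 3 numbers

full = view.clone()          # a real (4, 3) tensor with its own memory
full[0].add_(1.0)            # in-place edit touches only the first row

row_unchanged = torch.equal(row, torch.zeros(1, 3))
rows_differ = not torch.equal(full[0], full[1])

print(view.stride(), full.stride())
print(row_unchanged, rows_differ)
```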

tomahawk810 · October 26, 2018, 12:36pm

What about .contiguous instead of .clone ? Wouldn’t those 2 operations achieve the same goal before inplace modifications?

I might be going a bit off-topic; I am new to PyTorch and am trying to learn the best practices.

albanD · October 26, 2018, 12:38pm

Hi,

In this case, yes, they will do the same thing, because the input is always non-contiguous.
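
A short sketch of the comparison, with illustrative tensors `x`, `e` and `y`. On an expanded (non-contiguous) tensor, both `.contiguous()` and `.clone()` produce a fresh, fully materialized copy. The two differ on a tensor that is already contiguous, where `.contiguous()` returns the same memory and only `.clone()` copies:

```python
import torch

x = torch.arange(3.0).unsqueeze(0)    # shape (1, 3)
e = x.expand(4, 3)                    # non-contiguous view of x

e_is_contiguous = e.is_contiguous()
c1 = e.contiguous()                   # copies, because e is not contiguous
c2 = e.clone()                        # always copies

both_copied = (c1.data_ptr() != x.data_ptr()) and (c2.data_ptr() != x.data_ptr())

# On an already contiguous tensor the two calls behave differently.
y = torch.arange(12.0).view(4, 3)
contiguous_is_noop = y.contiguous().data_ptr() == y.data_ptr()
clone_copies = y.clone().data_ptr() != y.data_ptr()

print(e_is_contiguous, both_copied, contiguous_is_noop, clone_copies)
```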

Hanxiong_Chen · October 29, 2018, 6:45pm

Thank you for your answer.

solsol · March 11, 2019, 11:36am

Hi,

A and B are feature maps, in shape of [12, 9, 64] each.
I have this code to concatenate them 9 times in different cells:
A = A.repeat(1, 1, 9).view(12, 81, 64)
B = B.repeat(1, 9, 1)

C = torch.cat((A, B), dim=2)

However, I ran out of memory when running this code. Is it possible to use expand here as well?
Thanks!

tom · March 11, 2019, 1:39pm

So to make this a self-contained example:

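import torch

A = torch.randn([12, 9, 64])
B = torch.randn([12, 9, 64])
# Original approach: repeat() materializes both intermediates.
Ar = A.repeat(1, 1, 9).view(12, 81, 64)
Br = B.repeat(1, 9, 1)
C = torch.cat((Ar, Br), dim=2)
# Same result with expand(): the intermediates are views, only D is allocated.
D = torch.cat([A.unsqueeze(2).expand(-1, -1, 9, -1),
               B.unsqueeze(1).expand(-1, 9, -1, -1)], dim=-1).view(12, 81, 128)
print((C - D).abs().max().item())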

gives 0, and going from A and B to D, only the memory for D is newly allocated.

Best regards

Thomas

P.S.: Enclose multiple lines of source code with triple backticks ``` to have them rendered nicely. Single backticks for “in-text” commands.
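
As a rough check of the memory claim, here is a sketch that assumes the default float32 dtype and uses the illustrative names `A_view` and `B_view`. The expanded inputs start at the same address as A and B, so they are views. The output of torch.cat is a new tensor, and its size can be computed directly:

```python
import torch

A = torch.randn(12, 9, 64)
B = torch.randn(12, 9, 64)

A_view = A.unsqueeze(2).expand(-1, -1, 9, -1)   # (12, 9, 9, 64), no copy
B_view = B.unsqueeze(1).expand(-1, 9, -1, -1)   # (12, 9, 9, 64), no copy

# The views start at the same memory address as the tensors they come from.
shares_A = A_view.data_ptr() == A.data_ptr()
shares_B = B_view.data_ptr() == B.data_ptr()

# torch.cat writes its result into newly allocated memory.
D = torch.cat([A_view, B_view], dim=-1).view(12, 81, 128)
d_bytes = D.element_size() * D.numel()          # 4 bytes per float32 element

print(shares_A, shares_B, d_bytes)
```

D itself is still fully materialized, so for much larger shapes that single allocation may be enough to run out of memory on its own.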

solsol · March 12, 2019, 5:05am

Thomas, thanks for your reply. It was helpful; however, there is still a memory problem: “CUDA out of memory”.

pkmandke · May 1, 2020, 6:13pm

Hi.

I have a somewhat related query so thought I’d ask here instead of starting a new thread.

So, I am trying to use VGG as a feature extractor. My model returns a (grayscale) image tensor of size (1, 512, 512) that I want to feed into VGG which takes in 3 channel RGB images. The idea is to compute a loss based on the output of VGG and backprop that into my model.

So, if I repeat the grayscale image across the 3 channels before feeding it into VGG, will .backward() successfully backprop into my model? Some code snippets would be quite helpful.

Thanks.

albanD · May 1, 2020, 6:32pm

Hi,

Yes, both repeat and expand are differentiable, so you can use them in your net like any other op.
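
A minimal sketch of the grayscale case. The names here are assumptions for illustration: `gray` stands in for the model output (a small 8x8 image instead of 512x512), and a single frozen conv layer stands in for the VGG feature extractor:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the model output, shape (1, H, W).
gray = torch.rand(1, 8, 8, requires_grad=True)

# Hypothetical stand-in for the frozen VGG feature extractor.
features = nn.Conv2d(3, 4, kernel_size=3)
for p in features.parameters():
    p.requires_grad_(False)          # the extractor itself is not trained

# Add a batch dimension, then expand the single channel to 3 channels (no copy).
rgb = gray.unsqueeze(0).expand(-1, 3, -1, -1)    # shape (1, 3, H, W)

loss = features(rgb).pow(2).mean()
loss.backward()

grad_shape = tuple(gray.grad.shape)
print(rgb.shape, grad_shape)
```

The gradient arriving at `gray` is the sum of the gradients of the three expanded channels. With `.repeat(1, 3, 1, 1)` in place of the expand, the same gradient would be expected, at the cost of an extra copy.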

WolfLo · May 19, 2022, 2:32pm

Hi, in the following example, is Ar2 a better alternative to Ar, or does the reshape() function also allocate new memory, offsetting the advantage of expand() over repeat()?

A = torch.randn([2, 3])
Ar = A.repeat(3,1)
Ar2 = A.unsqueeze(0).expand(3,-1,-1).reshape(3*A.shape[0],-1)

Thanks in advance
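
One way to check this is to compare memory addresses, as in the sketch below (the intermediate name `view` is added for illustration). The expanded tensor shares memory with A, but its stride-0 dimension cannot be merged with the next one without copying, so reshape() returns a new tensor here rather than a view:

```python
import torch

A = torch.randn(2, 3)
Ar = A.repeat(3, 1)                           # (6, 3), copies the data

view = A.unsqueeze(0).expand(3, -1, -1)       # (3, 2, 3), shares memory with A
Ar2 = view.reshape(3 * A.shape[0], -1)        # (6, 3)

same_result = torch.equal(Ar, Ar2)
view_shares_memory = view.data_ptr() == A.data_ptr()
reshape_copied = Ar2.data_ptr() != A.data_ptr()

print(same_result, view_shares_memory, reshape_copied)
```

So in this example Ar2 ends up allocating the same amount of memory as Ar.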