torch.repeat and torch.expand: which to use?

I have one vector and one 2-D tensor. Say embed1 = [1, 2, 3] with shape (3,) and embed2 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] with shape (3, 3). I want to expand embed1 to [[1, 2, 3], [1, 2, 3], [1, 2, 3]] so that I can concatenate it with embed2 and send the result to an MLP. The point is to avoid a for loop that concatenates embed1 with each vector in embed2 and sends each concatenated vector to the MLP separately. The documentation says that torch.repeat copies the data, so I don't know whether this affects backpropagation, since the copies are used instead of the original. I want autograd to update only my original embed1.



Backprop-wise, they will give the exact same result.
The difference is that if the dimension you want to expand has size 1, you can use torch.expand() to do it without using extra memory.
If the dimension you want to expand has size greater than 1, then you actually want to repeat what is along that dimension and you should use torch.repeat(). It will use extra memory (there is no way around that).
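To make this concrete for the embed1/embed2 case from the question, here is a minimal sketch. The `nn.Linear` layer is only a hypothetical stand-in for the MLP, and the variable names other than embed1 and embed2 are made up for illustration. It checks that the gradient reaching embed1 is the same whether the rows come from expand or from repeat.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

embed1 = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)   # shape (3,)
embed2 = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0],
                       [7.0, 8.0, 9.0]])                      # shape (3, 3)

# Hypothetical stand-in for the MLP: maps each concatenated row to one value.
mlp = nn.Linear(6, 1)

# expand(): add a size-1 dim, then broadcast it as a view (stride 0, no copy).
expanded = embed1.unsqueeze(0).expand(embed2.size(0), -1)     # shape (3, 3)
# repeat(): same values, but materialized in new memory.
repeated = embed1.repeat(embed2.size(0), 1)                   # shape (3, 3)

loss_expand = mlp(torch.cat([expanded, embed2], dim=1)).sum()
(grad_expand,) = torch.autograd.grad(loss_expand, embed1)

loss_repeat = mlp(torch.cat([repeated, embed2], dim=1)).sum()
(grad_repeat,) = torch.autograd.grad(loss_repeat, embed1)

print(expanded.stride())                        # stride 0 along the expanded dim
print(torch.allclose(grad_expand, grad_repeat))
```

In both variants the gradient has the shape of the original embed1, because autograd sums the contributions of all rows back into the one underlying vector.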


Therefore, when expanding a dimension of size 1, should torch.expand() always be used instead of torch.repeat()?


Keep in mind though that if you plan on changing this expanded tensor in place, you will need to call .clone() on it first so that it actually is a full tensor (with memory for each element). But even .expand().clone() should be faster than .repeat(), I think.
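A small sketch of the clone-before-modifying pattern, with made-up tensor names:

```python
import torch

base = torch.tensor([[1.0, 2.0, 3.0]])   # shape (1, 3)
view = base.expand(3, -1)                # shape (3, 3), shares base's memory
print(view.stride())                     # stride 0 along the expanded dim

full = view.clone()                      # now every element has its own memory
full[0].add_(10.0)                       # in-place edit touches only row 0

print(full)
print(base)                              # the original is left unchanged
```

After the clone, the rows are independent of each other and of base, so in-place updates behave like they would on any ordinary tensor.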


What about .contiguous() instead of .clone()? Wouldn't those two operations achieve the same goal before in-place modifications?

I might be going a bit off-topic; I am new to PyTorch and am trying to learn the best practices.


In this case, yes, they will do the same thing, because the input (the expanded tensor) is always non-contiguous, so .contiguous() has to make a copy too :)
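A quick way to see where the two calls agree and where they differ (a sketch with made-up names): on an expanded tensor both produce a fresh copy, but on a tensor that is already contiguous, .contiguous() hands back the same memory while .clone() still copies.

```python
import torch

x = torch.arange(3.0).unsqueeze(0)      # shape (1, 3)
e = x.expand(4, -1)                     # shape (4, 3), view with stride 0

print(e.is_contiguous())                # the expanded view is not contiguous

c1 = e.contiguous()                     # has to copy here
c2 = e.clone()                          # always copies
print(c1.data_ptr() != x.data_ptr(), c2.data_ptr() != x.data_ptr())

y = torch.randn(2, 3)                   # already contiguous
same_memory = y.contiguous().data_ptr() == y.data_ptr()
new_memory = y.clone().data_ptr() != y.data_ptr()
print(same_memory, new_memory)
```

So .clone() is the safer habit when you need a guaranteed independent copy, while .contiguous() only copies when it has to.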


Thank you for your answer.


A and B are feature maps, each of shape [12, 9, 64].
I have this code to repeat each of them 9 times and concatenate them in different cells:
A = A.repeat(1, 1, 9).view(12, 81, 64)
B = B.repeat(1, 9, 1)

C = torch.cat((A, B), dim=2)

However, I ran out of memory when running this code. Is it possible to use expand instead of view here as well?

So to make this a self-contained example:

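import torch

A = torch.randn([12, 9, 64])
B = torch.randn([12, 9, 64])
# Original approach: repeat() materializes the tiled copies of A and B
Ar = A.repeat(1, 1, 9).view(12, 81, 64)
Br = B.repeat(1, 9, 1)
C = torch.cat((Ar, Br), dim=2)
# Alternative: expand() only creates views, so torch.cat allocates just D
D = torch.cat([A.unsqueeze(2).expand(-1, -1, 9, -1),
               B.unsqueeze(1).expand(-1, 9, -1, -1)], dim=-1).view(12, 81, 128)
print((C - D).abs().max().item())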

This prints 0, and going from A and B to D, only the memory for D is newly allocated.

Best regards


P.S.: Enclose multiple lines of source code with triple backticks ``` to have them rendered nicely. Single backticks for “in-text” commands.


Thomas, thanks for your reply. It was helpful; however, there is still a memory problem:
“cuda out of memory”


I have a somewhat related query so thought I’d ask here instead of starting a new thread.

So, I am trying to use VGG as a feature extractor. My model returns a (grayscale) image tensor of size (1, 512, 512) that I want to feed into VGG, which takes 3-channel RGB images. The idea is to compute a loss based on the output of VGG and backprop that into my model.

So, if I repeat the grayscale image across the 3 channels before feeding it into VGG, will .backward() successfully backprop into my model? Some code snippets would be quite helpful.



Yes, both repeat and expand are differentiable, so you can use them in your net like any other op.
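Since a snippet was asked for, here is a rough sketch. It does not load VGG: a single `nn.Conv2d` with 3 input channels is a hypothetical stand-in for the feature extractor, a random tensor stands in for the model output, the image is shrunk to 64x64 to keep it quick, and a batch dimension is assumed. The squared-mean loss is arbitrary too.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for VGG: any module that expects 3 input channels.
feature_extractor = nn.Conv2d(3, 8, kernel_size=3, padding=1)

# Hypothetical stand-in for the model output: one grayscale image, shape (N, 1, H, W).
gray = torch.rand(1, 1, 64, 64, requires_grad=True)

# Variant 1: repeat() copies the single channel three times.
loss_repeat = feature_extractor(gray.repeat(1, 3, 1, 1)).pow(2).mean()
(grad_repeat,) = torch.autograd.grad(loss_repeat, gray)

# Variant 2: expand() creates a 3-channel view without copying.
loss_expand = feature_extractor(gray.expand(-1, 3, -1, -1)).pow(2).mean()
(grad_expand,) = torch.autograd.grad(loss_expand, gray)

print(grad_repeat.shape)                                    # same shape as gray
print(torch.allclose(grad_repeat, grad_expand, atol=1e-5))
```

In a real setup, gray would be the output of your own model, and calling .backward() on the loss would carry these gradients on into its parameters.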


Hi, in the following example, is Ar2 a better alternative to Ar, or does reshape() also allocate new memory, offsetting the advantage of expand() over repeat()?

A = torch.randn([2, 3])
Ar = A.repeat(3,1)
Ar2 = A.unsqueeze(0).expand(3,-1,-1).reshape(3*A.shape[0],-1)

Thanks in advance
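One way to check this yourself is to compare data pointers; the sketch below does that, with `expanded` as a made-up intermediate name. As far as I understand, reshape() returns a view only when the strides allow it and copies otherwise. Here the expanded dimension has stride 0, so merging it with the next dimension cannot be expressed as a view, and reshape() should end up allocating the full result much like repeat() does.

```python
import torch

A = torch.randn([2, 3])
Ar = A.repeat(3, 1)

expanded = A.unsqueeze(0).expand(3, -1, -1)     # shape (3, 2, 3), still a view of A
Ar2 = expanded.reshape(3 * A.shape[0], -1)      # shape (6, 3)

print(torch.equal(Ar, Ar2))                     # same values either way
print(expanded.data_ptr() == A.data_ptr())      # the expand step shares A's memory
print(Ar2.data_ptr() == A.data_ptr())           # False means reshape made a copy
print(Ar2.is_contiguous())
```

So for this particular pattern the expand-then-reshape route probably does not save memory; the saving from expand() only lasts as long as the result can stay a view.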