Loop Optimization in PyTorch

Hi, everyone! I have an array [a0, a1, …, an] and now I want to create a new array [b0, b1, …, bn] where

b0 = a0
b1 = a0 + a1
b2 = a0 + a1 + a2
…
bn = a0 + a1 + a2 + … + an

And the code I have written is

roll = len(A)
B = torch.zeros(2 * roll, device=A.device, dtype=A.dtype)
for i in range(0, roll):
    B[i:i + roll] += A
B = B[:roll]

This still takes a lot of time. Any ideas for optimizing it further?

How about this:

B = A.clone()
for i in range(1, len(A)):
    B[i] += B[i - 1]

Actually, there is no need for a Python loop. PyTorch has a built-in function for this, torch.cumsum:

B = torch.cumsum(A, dim=0)
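
As a quick sanity check, here is a minimal sketch comparing torch.cumsum with the running-sum loop from the earlier reply. The input values are made up for illustration, and loop_prefix_sum is a hypothetical helper name, not something from PyTorch.

```python
import torch

def loop_prefix_sum(A):
    # Hypothetical helper: the running-sum loop suggested earlier in the thread.
    B = A.clone()
    for i in range(1, len(A)):
        B[i] += B[i - 1]
    return B

# Illustrative input; any 1-D tensor should behave the same way.
A = torch.tensor([1.0, 2.0, 3.0, 4.0])

B_loop = loop_prefix_sum(A)
B_cumsum = torch.cumsum(A, dim=0)

print(B_cumsum.tolist())                 # [1.0, 3.0, 6.0, 10.0]
print(torch.equal(B_loop, B_cumsum))     # True
```

For larger floating-point inputs the two results may differ by rounding, so torch.allclose is probably a safer comparison than torch.equal there.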

Thank you so much! Fixed my problem.
