Does backprop speed depend on how values are assigned to a tensor?

Hi,
Recently I found that how I assign values to a tensor affects the speed of backward().
My test code is below. In short, I fill a (1000, 1000) tensor with ones using two different methods: assigning row by row via indexing, and collecting the rows in a list and then calling stack(). I then multiply all values by 2 and sum them.

import time
import torch

start = time.time()
from_empty = torch.empty(1000, 1000)
for i in range(1000):
    from_empty[i, :] = torch.ones(1000, requires_grad=True)
from_empty = 2*from_empty
from_empty = from_empty.sum()
from_empty.backward()
end = time.time()
print("assign by index", end-start)

start = time.time()
tmp = []
for i in range(1000):
    tmp.append(torch.ones(1000, requires_grad=True))
stacked = torch.stack(tmp, dim=1)
stacked = 2*stacked
stacked = stacked.sum()
stacked.backward()
end = time.time()
print("stack", end-start)

The result is:

assign by index 1.5671675205230713
stack 0.08823513984680176

Why is the first one so slow?

Thanks

Hi,

You can use the torchviz package to visualize the generated graph and see why. (Change the size of the first dimension to 10 so the graph stays readable :wink: )

As you will see, one version creates the output in a single operation, while the other builds it through 1000 in-place operations, each of which adds a node to the autograd graph.
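If you don't want to install torchviz, a minimal sketch (with small sizes chosen here for readability) that inspects the grad_fn of the two results hints at the same thing: each indexed assignment is recorded as an in-place CopySlices operation, while stack() produces the whole output with a single StackBackward node:

```python
import torch

rows, cols = 3, 4  # small sizes so the graph is easy to inspect

# Version 1: fill a preallocated tensor row by row.
# Every indexed assignment is an in-place copy tracked by autograd.
by_index = torch.empty(rows, cols)
for i in range(rows):
    by_index[i, :] = torch.ones(cols, requires_grad=True)
print(by_index.grad_fn)  # CopySlices (autograd records one per assignment)

# Version 2: build the result in a single stack() call.
stacked = torch.stack(
    [torch.ones(cols, requires_grad=True) for _ in range(rows)], dim=1
)
print(stacked.grad_fn)  # a single StackBackward node
```

Backward then has to walk 1000 CopySlices nodes in the first version, versus one StackBackward node in the second, which matches the timing difference above.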


Thanks for the clear explanation :slight_smile:
