Hi,

Recently I found that the way I assign values to a tensor affects the speed of backward().

My test code is below.

Put simply, I fill a (1000, 1000) tensor with ones in two different ways: assigning each slice by index, and collecting the slices in a list and then calling stack().

Then I multiply the tensor by 2, sum all the values, and call backward().

```
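import time

import torch

# Method 1: fill a preallocated tensor row by row via index assignment.
start = time.time()
from_empty = torch.empty(1000, 1000)
for i in range(1000):
    from_empty[i, :] = torch.ones(1000, requires_grad=True)
from_empty = 2 * from_empty
from_empty = from_empty.sum()
from_empty.backward()
end = time.time()
print("assign by index", end - start)

# Method 2: collect the vectors in a list, then build the tensor with stack().
# Note: dim=1 places each vector in a column rather than a row; with all-ones
# data the resulting tensor has the same values as in method 1.
start = time.time()
tmp = []
for i in range(1000):
    tmp.append(torch.ones(1000, requires_grad=True))
stacked = torch.stack(tmp, dim=1)
stacked = 2 * stacked
stacked = stacked.sum()
stacked.backward()
end = time.time()
print("stack", end - start)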
```

The result is:

```
assign by index 1.5671675205230713
stack 0.08823513984680176
```
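The timings above cover the whole run, from building the tensor through backward(). One way to narrow this down is to time the two phases separately. The following is only a rough sketch, not part of the measurement above: `build_by_index`, `build_by_stack`, and `time_phases` are hypothetical helper names introduced for illustration, and `time.perf_counter()` is used in place of `time.time()`.

```python
import time

import torch


def build_by_index(n):
    # Hypothetical helper mirroring the "assign by index" method.
    leaves = [torch.ones(n, requires_grad=True) for _ in range(n)]
    out = torch.empty(n, n)
    for i, leaf in enumerate(leaves):
        out[i, :] = leaf  # in-place write into a slice, tracked by autograd
    return leaves, out


def build_by_stack(n):
    # Hypothetical helper mirroring the "stack" method.
    leaves = [torch.ones(n, requires_grad=True) for _ in range(n)]
    return leaves, torch.stack(leaves, dim=0)


def time_phases(build, n):
    # Returns (forward_seconds, backward_seconds, leaves, loss_value).
    # The forward time includes creating the leaf tensors.
    t0 = time.perf_counter()
    leaves, out = build(n)
    loss = (2 * out).sum()
    t1 = time.perf_counter()
    loss.backward()
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, leaves, loss.item()


if __name__ == "__main__":
    for name, build in [("assign by index", build_by_index),
                        ("stack", build_by_stack)]:
        fwd, bwd, _, _ = time_phases(build, 1000)
        print(f"{name}: forward {fwd:.4f}s, backward {bwd:.4f}s")
```

Keeping the leaf tensors in a list also makes it possible to check afterwards that both methods produce the same gradients (every entry of every `leaf.grad` should be 2).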

Why is the first one so slow?

Thanks