Let’s say I have an input x that is a variable, and a double loop with a dummy variable that accumulates x in the inner loop and gets reset on every outer iteration. For example:

```
x = Variable( … )
```

```
for i in range(0, 10):
    k = Variable(size(x)).fill(0)
    for j in range(0, i):
        k = k + x
    store[i] = k
```

Or

```
k = Variable(size(x)).fill(0)
```

```
for i in range(0, 10):
    k[:] = 0
    for j in range(0, i):
        k = k + x
    store[i] = k
```

Which one produces the correct behavior, or is preferred for backpropagation, and why? The first case seems to create many more variables, but is just storing the result enough to keep the gradients alive, as in the second case?
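For reference, here is a runnable sketch of the two patterns. This assumes modern PyTorch, where `Variable` is merged into `Tensor`, so I replace `Variable(size(x)).fill(0)` with `torch.zeros_like(x)`; the exact `Variable` API in my snippets above is from older versions.

```python
import torch

# Pattern 1: a fresh accumulator is allocated on every outer iteration.
x1 = torch.ones(3, requires_grad=True)
store_a = []
for i in range(10):
    k = torch.zeros_like(x1)   # new tensor each time
    for j in range(i):
        k = k + x1             # builds a small graph per entry
    store_a.append(k)

# sum(store_a) = (0 + 1 + ... + 9) * x1 = 45 * x1
total_a = sum(t.sum() for t in store_a)
total_a.backward()
print(x1.grad)                 # each element should be 45

# Pattern 2: one accumulator, reset in place with k[:] = 0.
x2 = torch.ones(3, requires_grad=True)
k = torch.zeros_like(x2)
store_b = []
for i in range(10):
    k[:] = 0                   # in-place reset of whatever k points to
    for j in range(i):
        k = k + x2             # rebinds k to a NEW tensor
    store_b.append(k)

# Note the aliasing: store_b[i] is the same object that k[:] = 0
# clobbers at the start of iteration i + 1, so its stored values
# (and its grad_fn) are overwritten after the fact.
print(store_b[1])              # zeroed, even though it was 1 * x2 when stored
print(store_b[9])              # only the last entry survives intact
```

So in pattern 2 the in-place reset silently destroys the stored intermediate results (only the last entry is left untouched), which is the kind of thing I am worried about for the gradients as well.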

Thanks