Regarding this tradeoff, do you save time/memory by using `retain_graph=True` in that situation?

For example, my current code looks like this:

```
import torch

x = torch.tensor(x0, requires_grad=True)
loss = 0
for i in range(inputs.numel()):  # For my apps, it's between 5 and 50.
    rec = f(x, i)
    loss += loss_func(inputs[i], rec)
loss.backward()  # one backward through the whole accumulated graph
g = x.grad
```

My current problem is that the computational graph takes too much memory, because the function `f` does a lot of computation. So a solution would be to do as @albanD suggested:

```
import torch

x = torch.tensor(x0, requires_grad=True)
for i in range(inputs.numel()):
    rec = f(x, i)
    loss = loss_func(inputs[i], rec)
    loss.backward()  # frees this iteration's graph right away; gradients accumulate into x.grad
g = x.grad
```
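To check my understanding of that suggestion: calling `backward()` once per iteration should accumulate into `x.grad` exactly the same gradient as one `backward()` on the summed loss, while only ever holding one small graph in memory. A minimal sketch with toy stand-ins for `f` and `loss_func` (the real functions are not shown here):

```
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# One backward over the accumulated loss (coefficients 1 + 2 + 3 = 6):
total = sum((x * float(i)).sum() for i in range(1, 4))
total.backward()
g_once = x.grad.clone()

# Per-iteration backwards, each on its own small graph:
x.grad = None
for i in range(1, 4):
    loss = (x * float(i)).sum()
    loss.backward()  # this iteration's graph is freed right after
g_loop = x.grad

print(torch.equal(g_once, g_loop))  # both gradients are [6., 6.]
```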

But I feel like the computational graph for each iteration of that loop is the same; it's just the numbers we apply it to that change. So maybe we could reuse the previous iteration's graph (by specifying `retain_graph=True`), could that save some time? If not, what would happen (in terms of time/memory loss/gain)?

```
import torch

x = torch.tensor(x0, requires_grad=True)
for i in range(inputs.numel()):
    rec = f(x, i)
    loss = loss_func(inputs[i], rec)
    # retain_graph is an argument to backward(), not to tensor():
    loss.backward(retain_graph=True)
g = x.grad
```
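For context on what the flag actually controls, here is a minimal self-contained sketch (toy tensors, unrelated to `f` above): `retain_graph=True` only keeps the intermediate buffers of the *same* graph alive so that it can be backpropagated through again; it does not make a new forward pass reuse an old graph.

```
import torch

x = torch.tensor([2.0], requires_grad=True)
y = (x * x).sum()  # one small graph

y.backward(retain_graph=True)  # buffers kept; a second backward on this graph is allowed
y.backward()                   # without the earlier retain_graph=True, this would raise an error

# d(x^2)/dx = 2x = 4.0 per backward, accumulated over two calls:
print(x.grad)  # tensor([8.])
```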