I am trying to do something like this (simplified version of my code):

```
final_result = 0
loss = 0
for x in range(1, 1000):
    output = model(data)
    # Change data
    loss = loss + F.nll_loss(output, target)
# Calculate gradients of model in backward pass
loss.backward()
# Collect gradients
final_result = final_result + myvar.grad.data
```

The problem is that the accumulated loss keeps every iteration's intermediate tensors alive, so the temporary variables cause me to run out of GPU memory. Hence, is this next piece of code logically equivalent?

```
final_result = 0
for x in range(1, 1000):
    output = model(data)
    loss = F.nll_loss(output, target)
    # Change data
    # Calculate gradients of model in backward pass
    loss.backward(retain_graph=True)
    # Collect gradients
    final_result = final_result + myvar.grad.data
    del loss
    del other_variables
```

If I understand correctly how `.backward()` and `.grad.data` work, the two versions should be equivalent. However, this is not the case for me, and I'm currently looking for the bug.
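For reference, the equivalence I am assuming rests on linearity of differentiation: the gradient of a summed loss equals the sum of the per-step gradients. A minimal plain-Python sketch of that assumption, using a quadratic loss as a stand-in for `F.nll_loss` (all names and values here are illustrative, not my actual model):

```python
# Per-step loss and its analytic gradient; a quadratic stands in
# for F.nll_loss just to make the derivative easy to write down.
def loss_i(w, t):
    return (w - t) ** 2

def grad_i(w, t):
    # d/dw (w - t)^2 = 2 * (w - t)
    return 2.0 * (w - t)

w = 0.5
targets = [0.1, -0.3, 0.7, 1.2]

# Version 1: build the accumulated loss, then differentiate it once
# (here via a central finite difference instead of autograd).
def total_loss(w):
    return sum(loss_i(w, t) for t in targets)

eps = 1e-6
grad_of_sum = (total_loss(w + eps) - total_loss(w - eps)) / (2 * eps)

# Version 2: differentiate each per-step loss and accumulate the gradients.
sum_of_grads = 0.0
for t in targets:
    sum_of_grads += grad_i(w, t)

print(grad_of_sum, sum_of_grads)  # should agree up to float rounding
```

If the two printed values agree (as they do here), the difference I am seeing in the real code must come from how PyTorch accumulates `.grad` between `backward()` calls, not from the math itself.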