Hi everyone,

I’m trying to run the backward pass of an output multiple times for different grad_tensors and store the individual resulting grads of my input variable like so:

```
import numpy as np
import torch
from torch.autograd import Variable
x = Variable(torch.randn(10), requires_grad=True)
y = x*x
n_back = 5
grad_tensors = [torch.randn(10) for _ in range(n_back)]
grads = np.empty([10,n_back])
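for i in range(n_back):
    # Backpropagate the i-th grad tensor; keep the graph for the next iteration
    y.backward(grad_tensors[i], retain_graph=True)
    # Copy the accumulated gradient of x into column i
    grads[:, i] = x.grad.data.numpy()
    # Reset the gradient so the next backward pass does not accumulate into it
    x.grad.data.zero_()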
print(grads)
```

Is there any way to make this more efficient, e.g. by executing all of the backward passes in parallel within the graph (instead of just parallelizing the for loop)?
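For illustration, here is a minimal sketch of what a batched version might look like. It assumes a PyTorch version recent enough (1.11 or later, as far as I know) that `torch.autograd.grad` accepts `is_grads_batched=True`, and it stacks the grad tensors into one tensor with a leading batch dimension. The names `batched` and `ref` are just for this example, and whether this is actually faster will depend on the ops in the graph.

```python
import torch

torch.manual_seed(0)

x = torch.randn(10, requires_grad=True)
y = x * x
n_back = 5

# Stack all grad tensors along a new leading batch dimension: shape (n_back, 10)
grad_tensors = torch.randn(n_back, 10)

# One call computes the vector-Jacobian product for every row of grad_tensors.
# Assumption: is_grads_batched is available in the installed PyTorch version.
(batched,) = torch.autograd.grad(
    y, x, grad_outputs=grad_tensors, retain_graph=True, is_grads_batched=True
)
grads = batched.T  # shape (10, n_back), same layout as in the loop version

# Reference: the original one-backward-pass-per-grad-tensor loop
ref = torch.stack(
    [torch.autograd.grad(y, x, g, retain_graph=True)[0] for g in grad_tensors],
    dim=1,
)

print(torch.allclose(grads, ref))
```

Using `torch.autograd.grad` instead of `backward` returns the gradients directly, so there is no need to zero `x.grad` between passes.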

Any help would be greatly appreciated!

EDIT: Clarified the question a bit.