Running backward in parallel for multiple grad_tensors

Hi everyone,

I’m trying to run the backward pass of an output multiple times with different grad_tensors and store the resulting gradients of my input variable individually, like so:

import numpy as np
import torch
from torch.autograd import Variable

x = Variable(torch.randn(10), requires_grad=True)
y = x * x

n_back = 5
grad_tensors = [torch.randn(10) for _ in range(n_back)]

# one backward pass per grad_tensor, collecting the gradients column by column
grads = np.empty([10, n_back])
for i in range(n_back):
    y.backward(grad_tensors[i], retain_graph=True)
    grads[:, i] = x.grad.data.numpy()
    x.grad.data.zero_()  # reset so gradients don't accumulate across passes

print(grads)

Is there any way to make this more efficient, e.g. to execute the backward passes in parallel within the graph (rather than just parallelizing the for loop)?
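
One idea I had is to fake the parallelism by batching: repeat x along a new leading dimension, recompute y in batched form, and ask autograd for the gradient with respect to the repeated tensor in a single call. I'm not sure this is the recommended way, but here is a minimal sketch continuing from the snippet above (x, n_back, grad_tensors as defined there), assuming torch.autograd.grad accepts a non-leaf tensor as input on my version:

x_rep = x.unsqueeze(0).expand(n_back, 10)  # view of x with a batch dimension, still in the graph
y_batched = x_rep * x_rep                  # same elementwise function, now computed per row
stacked = torch.stack(grad_tensors)        # shape (n_back, 10), one grad_tensor per row
grads_batched, = torch.autograd.grad(y_batched, x_rep, grad_outputs=stacked)
print(grads_batched.t())                   # shape (10, n_back), should match grads above

Would something along these lines actually run the different grad_tensors through the graph together, or is it equivalent to the loop under the hood?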

Any help would be greatly appreciated!

EDIT: Clarified the question a bit.