Parallelism compatible with autograd?

I am using PyTorch to do stochastic gradient descent in order to maximize a target function (the evidence lower bound for a variational method). I am using CPU tensors.

I have a model with parameters. In each iteration I sample a number of trees and their corresponding branch lengths using the model's parameters. For each tree t I apply a function f that computes a probability over the tree to get f(t), and then use these values to compute the loss. The computations in f are completely independent for different trees.
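To make the setup concrete, here is a simplified sketch of the sequential version of this step (sample_trees and f are placeholders standing in for my actual sampling routine and per-tree log-probability function):

import torch

# Sequential version: f is applied to each (tree, log_branch) pair in turn.
trees, log_branch = sample_trees(model)                    # sampled using the model's parameters
logp_trees = [f(pair) for pair in zip(trees, log_branch)]  # independent computations per tree
loss = -torch.stack(logp_trees).mean()                     # loss built from the per-tree results
loss.backward()                                            # gradients must flow back through f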

My question: is there any way to run the function f for different trees in parallel?

I tried the following:

from torch.multiprocessing import Pool

def logprior(trees, log_branch):
    n_particles = len(trees)
    # One worker per sampled tree; f is applied to each (tree, log_branch) pair.
    with Pool(n_particles) as pool:
        logp_trees = pool.map(f, zip(trees, log_branch))
    return logp_trees

But I got:

RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries. If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).

I understand the reason for this error, but I need the gradient computation to flow through f.

Thanks in advance.