I am using PyTorch with stochastic gradient descent to maximize a target function (the evidence lower bound for a variational method). I am working with CPU tensors.
I have a model with parameters. In each iteration I sample a number of trees and corresponding branch lengths using the model's parameters. For each tree t I apply a function f that computes a probability over the tree to get f(t), and use these values to compute the loss. The computations in f are completely independent across trees.
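For reference, the sequential version looks roughly like this (f, trees, and log_branch are simplified placeholders for my actual code):

import torch

# sequential version: f is applied to each (tree, branch-length) pair in a
# plain Python loop, so autograd tracks everything, but nothing runs in parallel
logp_trees = [f(pair) for pair in zip(trees, log_branch)]
loss = -torch.stack(logp_trees).mean()  # simplified stand-in for the ELBO loss
loss.backward()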
My question: is there any way to run the function f for different trees in parallel?
I tried the following:
from torch.multiprocessing import Pool

def logprior(trees, log_branch):
    n_particles = len(trees)
    with Pool(n_particles) as pool:
        # each worker gets one (tree, log_branch) pair
        logp_trees = pool.map(f, zip(trees, log_branch))
    return logp_trees
But I got:
RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries. If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).
I understand the reason for this error, but I need gradients to flow through f.
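As a minimal illustration of the constraint (not my real code): the branch lengths are produced from the model's parameters, so they are non-leaf tensors with requires_grad=True, and detaching them before sending them to a worker cuts the graph exactly where I need it:

import torch

params = torch.randn(3, requires_grad=True)
log_branch = params * 2            # non-leaf tensor, requires_grad=True
# passing log_branch through pool.map triggers the RuntimeError above;
# detaching makes it picklable, but gradients no longer reach params
detached = log_branch.detach()
print(detached.requires_grad)      # False: the graph connection is lost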
Thanks in advance.