In TensorFlow, independent sections of the graph are automatically executed in parallel when they are fetched together in a single session.run(…) call, e.g. sess.run([a, b]) can evaluate the ops producing a and b concurrently. A simple case in PyTorch looks like this:
```python
xs = [ ... ]      # list of torch.Tensors
models = [ ... ]  # list of nn.Modules
out = [m(x) for m, x in zip(models, xs)]
```
It seems to me that when this code is executed eagerly, there is no way to parallelize the calls m1(x1), m2(x2), …, mn(xn), even though doing so would give an obvious performance benefit.
What is the best way to achieve this effect in PyTorch? Is it to use the JIT?
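For concreteness, here is a minimal sketch of the kind of thing I am imagining with torch.jit.fork and torch.jit.wait. The models, sizes, and inputs are made up just to exercise the pattern, and my understanding from the docs is that the forked calls only actually run concurrently when the surrounding code is compiled with torch.jit.script, which is why the container module is scripted here:

```python
import torch
import torch.nn as nn
from typing import List
from torch import Tensor

class ParallelBranches(nn.Module):
    """Applies independent submodules to independent inputs."""

    def __init__(self, branches):
        super().__init__()
        self.branches = nn.ModuleList(branches)

    def forward(self, xs: List[Tensor]) -> List[Tensor]:
        # Fork every branch first so the m_i(x_i) calls can overlap,
        # then wait on all of the resulting futures.
        futures: List[torch.jit.Future[Tensor]] = []
        i = 0
        for m in self.branches:
            futures.append(torch.jit.fork(m, xs[i]))
            i += 1
        return [torch.jit.wait(f) for f in futures]

# Made-up models and inputs, just to run the pattern end to end.
models = [nn.Linear(64, 64) for _ in range(4)]
xs = [torch.randn(8, 64) for _ in range(4)]

ensemble = torch.jit.script(ParallelBranches(models))
out = ensemble(xs)
```

Is this the intended approach, or is there a preferred mechanism (CUDA streams, batching the models into one op, something else) for this kind of inter-module parallelism?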