Is there any way to parallelize code over modules that don’t depend on each other?

In TF, independent sections of the graph are automatically executed in parallel if they are fetched together in a single session.run(…) call. A minimal TF1-style sketch of this (the ops here are illustrative):
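import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 10])
a = tf.reduce_sum(tf.square(x))   # two ops with no dependency on each other
b = tf.reduce_mean(tf.abs(x))

with tf.Session() as sess:
    # fetching both in one run() call lets the runtime schedule them in parallel
    out_a, out_b = sess.run([a, b], feed_dict={x: np.random.randn(32, 10).astype(np.float32)})

The analogous simple case in PyTorch is this: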

xs = [ ... ] # list of torch.Tensors
models = [ ... ]  # list of nn.Modules
out = [m(x) for m, x in zip(models, xs)]

It seems to me that when this code is executed eagerly, there is no way to parallelize the calls m1(x1), m2(x2), …, mn(xn), even though doing so would give some obvious performance benefits.

What is the best way in PyTorch to achieve this effect? Is it to use the JIT?

The JIT won't run anything in parallel; right now it mostly does code transformations and fusions to optimize performance.
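For example, a chain of pointwise ops like the following (illustrative) function can be fused by torch.jit.script into a single kernel; that is an optimization of the sequential code, not concurrent execution:

import torch

@torch.jit.script
def f(x):
    # a pointwise chain the fuser can combine into one kernel
    return torch.relu(x * 2.0 + 1.0)

out = f(torch.randn(4, 4))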

If you have multiple GPUs, DataParallel might be helpful.
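A minimal sketch, assuming at least two visible GPUs and an illustrative nn.Linear module:

import torch
import torch.nn as nn

model = nn.DataParallel(nn.Linear(10, 10)).cuda()  # replicated across visible GPUs
x = torch.randn(32, 10).cuda()
out = model(x)  # the batch is split across devices and processed in parallel

Note that this splits one batch across replicas of a single module, which is a different kind of parallelism from running distinct modules concurrently.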