Separate deep networks have benefits over one big network with multiple outputs:
there is no correlation or interference between the networks.
But the bottleneck is that you cannot run them in parallel unless you have several GPUs.
E.g., xs = [net(x) for net in nets] runs the networks one after another.
What I suggest is adding functionality to nn.ModuleList so that it supports synchronous parallelism, either as an option or by default, e.g. via torch.multiprocessing (if that is possible).
The code above would then simply become: xs = nets(x)
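
To make the request concrete, here is a minimal sketch of the call syntax I have in mind. CallableModuleList is a hypothetical name, not an existing PyTorch class, and its forward still loops over the networks sequentially. It only marks the place where a parallel dispatch could go.

```python
import torch
import torch.nn as nn


class CallableModuleList(nn.ModuleList):
    """Hypothetical subclass (not part of PyTorch): calling the list applies
    every contained network to the same input and returns the outputs."""

    def forward(self, x):
        # Sequential stand-in: a parallel implementation would dispatch
        # these calls concurrently and wait for all of them here.
        return [net(x) for net in self]


# Three independent networks that share no parameters.
nets = CallableModuleList([nn.Linear(4, 2) for _ in range(3)])
x = torch.randn(5, 4)

xs = nets(x)  # same result as [net(x) for net in nets]
print(len(xs), xs[0].shape)
```

With this shape the caller's code would not change if the loop inside forward were later replaced by something that actually runs in parallel.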