When I perform deep learning inference in PyTorch, some models have branch structures in their architecture. In such cases, does PyTorch dispatch the operators on these branches to different CUDA streams for parallel execution? Based on my testing, it seems no such mechanism exists. I'd like to know why this is the case, or how I can achieve this kind of parallel inference.
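For reference, this is roughly the kind of manual multi-stream dispatch I'm asking about (a sketch only; `branch_a`/`branch_b` are placeholder branch functions, and on a machine without a GPU it just falls back to sequential execution):

```python
import torch

def branch_a(x):
    # placeholder for one branch of the model, e.g. a conv path
    return torch.relu(x @ x)

def branch_b(x):
    # placeholder for the other branch
    return torch.tanh(x @ x)

def forward_with_streams(x):
    # Streams only apply on a GPU; otherwise run the branches sequentially.
    if not torch.cuda.is_available():
        return branch_a(x) + branch_b(x)
    s1 = torch.cuda.Stream()
    s2 = torch.cuda.Stream()
    # Crude ordering: make sure work queued on the default stream is done
    # before the side streams read the input.
    torch.cuda.synchronize()
    with torch.cuda.stream(s1):
        a = branch_a(x)
    with torch.cuda.stream(s2):
        b = branch_b(x)
    torch.cuda.synchronize()  # join both streams before combining results
    return a + b

x = torch.randn(32, 32)
out = forward_with_streams(x)
print(out.shape)  # torch.Size([32, 32])
```

In my tests the branches still did not overlap by default unless I wrote something like this by hand.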
Thank you so much!
AFAIK PyTorch doesn’t automatically parallelize branches, because in eager mode Python and PyTorch code is executed line by line.
So to make code run faster you can use a JIT like torch.jit.script
or torch.compile,
which will specialize on one branch or the other. There are also active discussions about explicitly adding control flow ops in torch: Add support for dynamic control flow in torch.fx · Issue #99598 · pytorch/pytorch · GitHub
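If the goal is specifically to overlap two branches, one option that works even in eager mode is `torch.jit.fork` / `torch.jit.wait`, which run a callable asynchronously on PyTorch's inter-op thread pool. A minimal sketch (`branch_a`/`branch_b` are placeholder branch functions, not from the question):

```python
import torch

def branch_a(x):
    # placeholder branch
    return torch.relu(x @ x)

def branch_b(x):
    # placeholder branch
    return torch.tanh(x @ x)

def parallel_branches(x):
    fut = torch.jit.fork(branch_a, x)  # launch branch_a asynchronously
    b = branch_b(x)                    # run branch_b on the current thread
    a = torch.jit.wait(fut)            # block until branch_a finishes
    return a + b

x = torch.randn(64, 64)
out = parallel_branches(x)
print(out.shape)  # torch.Size([64, 64])
```

Whether this actually speeds things up depends on the branches being large enough to amortize the scheduling overhead; on GPU, kernel launches from both branches still go to the same default stream unless you also manage streams yourself.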