Speed up ensemble model by parallelizing the blocks

Mamdouh_Aljoud · January 9, 2021, 3:56pm

I am trying to speed up an algorithm that produces 3D skeleton joints from 2D images. The algorithm (GAST-Net) consists of 4 main blocks that run sequentially for every frame, and I am trying to parallelize them. I have some questions about the process.

1. Will parallelizing the blocks help speed up the algorithm? I am trying to parallelize on a single GPU only.
2. What other approaches can I look into to speed up the algorithm?
3. A slightly related question: isn’t PyTorch already trying to use as much of the GPU as it can to produce output as fast as possible? I monitored the GPU utilization and it stayed between 38% and 50%. Is there a way to make sure the GPU is used to the fullest?

pritamdamania87 · January 13, 2021, 12:21am

This really depends on the algorithm.

If you could share some sample PyTorch code for the algorithm, I can look into it to see if there are some optimization opportunities.
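
In the meantime, it may help to measure how long each block takes per frame, since whether parallelizing is worthwhile depends on where the time goes. Below is a minimal sketch of one way to do that, not the actual GAST-Net code: the four `nn.Conv2d` layers, the `blocks` dict, the `time_blocks` helper, and the input shape are all hypothetical stand-ins, and it assumes each block simply consumes the previous block's output. CUDA kernels launch asynchronously, so the sketch calls `torch.cuda.synchronize()` around each block to get meaningful wall-clock numbers.

```python
import time

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-ins for the four sequential blocks; swap in the real ones.
blocks = {
    "block_1": nn.Conv2d(3, 8, kernel_size=3, padding=1),
    "block_2": nn.Conv2d(8, 8, kernel_size=3, padding=1),
    "block_3": nn.Conv2d(8, 8, kernel_size=3, padding=1),
    "block_4": nn.Conv2d(8, 4, kernel_size=3, padding=1),
}
for module in blocks.values():
    module.to(device).eval()


def sync():
    # Wait for queued GPU work to finish so the timer sees the real cost.
    if device.type == "cuda":
        torch.cuda.synchronize()


def time_blocks(frame, n_iters=10):
    """Return the mean seconds spent in each block, plus the final output."""
    totals = {name: 0.0 for name in blocks}
    with torch.no_grad():
        # One untimed warm-up pass so one-time setup cost is not counted.
        x = frame
        for block in blocks.values():
            x = block(x)
        for _ in range(n_iters):
            x = frame
            for name, block in blocks.items():
                sync()
                start = time.perf_counter()
                x = block(x)
                sync()
                totals[name] += time.perf_counter() - start
    return {name: total / n_iters for name, total in totals.items()}, x


frame = torch.randn(1, 3, 64, 64, device=device)  # assumed input shape
timings, out = time_blocks(frame)
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

If one block dominates, optimizing that block is likely to matter more than overlapping the others. If the per-block times are small compared with the total time per frame, the gap is probably spent outside the GPU (data loading, pre-processing, host-to-device copies), which could also explain a utilization of 38 to 50%.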