I have a question about speed and performance in C++ and would appreciate it if anyone could answer it. I'm doing instance segmentation with PyTorch + CUDA, and the maximum FPS (frames per second) I can achieve is 12 FPS on full HD video. If I convert my code to C++ with CUDA, what FPS should I expect? By what percentage would my FPS improve?
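It depends a lot on what you are doing, but if it's CNN/U-Net-style (so no for loops over individual pixels), my guess would be a very small speedup (<10%, possibly much less) from moving from Python to C++. In that kind of model the heavy lifting is done by the same CUDA kernels whichever language calls them, so Python is only a small part of the per-frame time.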
I'm doing instance segmentation with Yolact++ (https://github.com/dbolya/yolact) to get object masks in real time. I wanted to know: if I convert the code to C++ with CUDA, what will my FPS be, approximately? Right now I can only reach 12-14 FPS on 1080p video, and I want to reach 25 FPS. Thanks!
So a <10% speedup over 12-14 FPS would give you roughly 12-15 FPS.
Moving to C++ is not the efficiency gain you’re looking for.
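To make the arithmetic explicit, here is a small sketch in plain Python. The function names `projected_fps` and `required_speedup` are made up for illustration, and the 10% figure is only my guess from above, not a measurement.

```python
def projected_fps(current_fps: float, speedup: float) -> float:
    """FPS after a fractional speedup (speedup=0.10 means 10% faster)."""
    return current_fps * (1.0 + speedup)


def required_speedup(current_fps: float, target_fps: float) -> float:
    """Fractional speedup needed to get from current_fps to target_fps."""
    return target_fps / current_fps - 1.0


if __name__ == "__main__":
    # 12 and 14 FPS are the figures reported in this thread; 25 FPS is the goal.
    for fps in (12.0, 14.0):
        best = projected_fps(fps, 0.10)      # assumed upper bound of 10%
        need = required_speedup(fps, 25.0)
        print(f"{fps:.0f} FPS: at most {best:.1f} FPS with a 10% gain, "
              f"but 25 FPS needs a {need:.0%} speedup")
```

With the numbers from this thread, a 10% gain tops out at 13.2 to 15.4 FPS, while reaching 25 FPS needs a speedup of about 79% to 108%.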
You can load your trained model in C++ and measure the difference in FPS yourself. My guess is that it won't increase as much as you expect. However, you could try an ONNX inference tool to increase the FPS. Please let me know if that resolves your issue.
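Whichever option you try, it helps to measure FPS the same way each time so the comparison is fair. Below is a minimal sketch of a timing harness using only the Python standard library. `measure_fps`, `dummy_frame` and the `sync` parameter are hypothetical names for this example, and `dummy_frame` is just a stand-in for your real per-frame inference call.

```python
import time
from typing import Callable


def _no_sync() -> None:
    """Default sync hook: nothing to wait for on CPU-only work."""


def measure_fps(process_frame: Callable[[], object],
                n_warmup: int = 5,
                n_frames: int = 50,
                sync: Callable[[], None] = _no_sync) -> float:
    """Average frames per second of process_frame() over n_frames calls."""
    # Warm-up iterations are excluded from timing (lazy init, caches, etc.).
    for _ in range(n_warmup):
        process_frame()
    sync()  # make sure pending asynchronous work is done before starting
    start = time.perf_counter()
    for _ in range(n_frames):
        process_frame()
    sync()  # wait for queued work to finish before stopping the clock
    elapsed = time.perf_counter() - start
    return n_frames / elapsed


def dummy_frame() -> None:
    # Stand-in workload (assumption): replace with your model's forward pass.
    time.sleep(0.005)


if __name__ == "__main__":
    print(f"{measure_fps(dummy_frame):.1f} FPS")
```

With PyTorch on a GPU you would pass `sync=torch.cuda.synchronize`, because CUDA calls return before the kernels finish and an unsynchronized timer can report an FPS that is too high.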