Tips for increasing Inference FPS

For background, I am currently using a version of this project: jwyang/faster-rcnn.pytorch (the pytorch-1.0 branch) on GitHub.

It is basically a Faster R-CNN implementation that runs against a webcam or a set of JPG files on disk.

My question is: are there any basic tips for increasing the FPS when running in demo mode?

I’m wondering whether I should come up with a buffering strategy, try compiling the code against PyTorch built from source, or switch to ONNX and load the model in the C++ version of PyTorch or some other inference-focused runtime.
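To make the buffering idea concrete, here is a minimal sketch of one possible strategy: a capture thread keeps overwriting a single "latest frame" slot, so the slow detector always works on the newest frame and stale frames are dropped instead of queuing up. It uses only the standard library; `LatestFrameBuffer`, `run_demo`, and the `time.sleep` calls standing in for webcam capture and the model forward pass are hypothetical and not part of faster-rcnn.pytorch.

```python
import threading
import time


class LatestFrameBuffer:
    """Holds only the most recent frame; unread older frames are overwritten."""

    def __init__(self):
        self._cond = threading.Condition()
        self._frame = None
        self._seq = 0          # how many frames have been put so far
        self._closed = False

    def put(self, frame):
        with self._cond:
            self._frame = frame
            self._seq += 1
            self._cond.notify_all()

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def get(self, last_seq, timeout=None):
        """Wait for a frame newer than last_seq.

        Returns (seq, frame), or None if the buffer was closed with nothing
        new to read or the timeout expired.
        """
        with self._cond:
            self._cond.wait_for(
                lambda: self._seq > last_seq or self._closed, timeout
            )
            if self._seq <= last_seq:
                return None
            return self._seq, self._frame


def run_demo(num_frames=20, capture_interval=0.001, infer_time=0.01):
    """Fake capture loop feeding a slower fake inference loop."""
    buf = LatestFrameBuffer()

    def producer():
        for i in range(num_frames):
            buf.put(i)                    # assumption: i stands in for an image
            time.sleep(capture_interval)  # assumption: stands in for camera I/O
        buf.close()

    thread = threading.Thread(target=producer, daemon=True)
    thread.start()

    processed, last_seq = [], 0
    while True:
        item = buf.get(last_seq, timeout=1.0)
        if item is None:
            break
        last_seq, frame = item
        time.sleep(infer_time)            # assumption: stands in for the model
        processed.append(frame)
    thread.join()
    return processed
```

Calling `run_demo()` returns the frame indices that were actually "inferred". When capture is faster than inference, that list is typically shorter than the number of captured frames. This keeps latency low, but it does not make the model itself any faster.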

I’ve customized a stream that sends frames over a network socket, and I’m thinking about switching to a compression technique to improve FPS (I currently get about 5 FPS on a 600x700 px JPG image). I’m really new to this space, though, and not sure how to speed things up.
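For the socket side, this is a rough sketch of what sending compressed frames could look like, assuming each frame has already been encoded to JPEG bytes elsewhere (for example with OpenCV's `cv2.imencode`, which is not called here). For scale, a raw 600x700 frame with three 8-bit channels is 600 * 700 * 3 = 1,260,000 bytes, and a JPEG-encoded frame is normally much smaller. The 4-byte length prefix is just one simple framing choice, and `send_frame`, `recv_exact`, and `recv_frame` are hypothetical helper names, not the code I currently run.

```python
import socket
import struct

# 4-byte big-endian unsigned integer holding the payload length
HEADER = struct.Struct("!I")


def send_frame(sock: socket.socket, payload: bytes) -> None:
    """Send one encoded frame, prefixed with its length."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock: socket.socket) -> bytes:
    """Receive one length-prefixed frame and return its payload bytes."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

On the receiving end, the payload returned by `recv_frame` would still need to be decoded back into an image before it goes to the detector, so the decode time counts against the FPS budget as well.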

I’m already running the latest PyTorch (1.10) with CUDA 11.3, which is the version supported by my machine.

The hardware is an RTX 3090 and a Ryzen 9 5950X with 64 GB of ECC 3200 MHz RAM, so I cannot really give it much more on the hardware side.