How to improve inference time?

Hi Everyone,

Can anyone recommend an efficient way to reduce inference time? Currently, my model takes 50 seconds per inference.

Objective: I have built a Flask app, and each request takes 50 seconds to run inference. I would like to reduce the inference time so the Flask app can return results faster.

Steps already taken:

  1. Moved the model to the GPU (see the sketch below)
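
For reference, this is roughly what that step looks like; a minimal sketch, assuming a standard torchvision-style detection model (`get_detection_model`, `preprocess`, and `raw_image` are placeholders, not names from my actual code):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = get_detection_model()  # placeholder: however your model is built/loaded
model.to(device)               # move the weights onto the GPU
model.eval()                   # disable dropout / batch-norm updates

with torch.no_grad():          # skip autograd bookkeeping during inference
    image = preprocess(raw_image)          # placeholder: returns a CHW float tensor
    outputs = model([image.to(device)])    # inputs must live on the same device
```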

Note:
Framework: PyTorch
Application: Object detection

Any recommendations on how to optimize this would be really helpful.

There’s too little info here for anyone to help you. What are your system specs? What GPU are you using? Is this batched inference, or a single input at a time? What does the Flask app do?
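
Also worth checking before you optimize anything: how the 50 s is being measured. CUDA calls are asynchronous, so a naive timer can fold one-time costs (model load, CUDA context creation, first-call warm-up) into the number. A minimal sketch, assuming `model` and `inputs` are already on the GPU as in your post:

```python
import time
import torch

# Warm up: the first forward passes pay one-time costs (memory allocation,
# kernel selection) that should not count toward steady-state latency.
for _ in range(3):
    with torch.no_grad():
        model(inputs)

torch.cuda.synchronize()          # wait for all queued GPU work to finish
start = time.perf_counter()
with torch.no_grad():
    model(inputs)
torch.cuda.synchronize()          # make sure the forward pass actually completed
print(f"steady-state inference: {time.perf_counter() - start:.3f} s")
```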