Hi everyone,
Can anyone recommend an efficient way to reduce inference time? Currently, my model takes 50 seconds per inference.
Objective: I have built a Flask app, and each request takes 50 seconds to run inference. I would like to reduce that time so the Flask app can return results faster.
Steps already taken:
- moved the model to the GPU
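For context, the GPU step above looks roughly like the sketch below (a minimal example; the tiny `nn.Sequential` stands in for my actual detection model, and the input is a dummy tensor). It also uses `model.eval()` and `torch.inference_mode()`, which are standard for inference:

```python
import torch
import torch.nn as nn

# Hypothetical placeholder standing in for the real object-detection model.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 4),
)

# Move the model to the GPU when available, and switch to eval mode
# (disables dropout and freezes batch-norm statistics).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device).eval()

# Dummy input created on the same device as the model.
image = torch.rand(1, 3, 64, 64, device=device)

# inference_mode() skips autograd bookkeeping entirely during the forward pass.
with torch.inference_mode():
    boxes = model(image)
```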
Note:
Framework: PyTorch
Application: Object detection
It would be really helpful if anyone could recommend an efficient way to optimize this.