How to avoid reloading a PyTorch ML model in a Flask REST API on every request

Hello there, I have built a Flask REST API that classifies images using ResNet50 in PyTorch. My problem is that the model and weights are loaded on every request. Is there any way to load them only once for the whole life cycle of the API?

I assume your Flask application has some kind of initialization step where you could set up your model once and then just run the prediction for each request.
Based on your description, it seems that you are recreating the model for each request. Is that correct?
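
In case it helps, here is a minimal sketch of what I mean, not your actual code. It assumes a recent torchvision (one that has `ResNet50_Weights`), pretrained ImageNet weights, and Pillow for decoding the upload. The names `load_resnet50`, `get_model`, `create_app`, the `/predict` route and the `image` form field are all made up for illustration. The idea is to keep the loaded model in a module-level cache so that the expensive load runs at most once per process, and every request reuses the same object.

```python
import threading

_lock = threading.Lock()
_cached = None  # holds (model, preprocess) once loaded


def load_resnet50():
    """Build ResNet-50 and its preprocessing. This is the slow part."""
    # Imported here so that importing this module stays cheap.
    from torchvision import models

    # Assumption: pretrained ImageNet weights; swap in your own checkpoint if needed.
    weights = models.ResNet50_Weights.DEFAULT
    model = models.resnet50(weights=weights)
    model.eval()  # inference mode for batch norm and dropout layers
    return model, weights.transforms()


def get_model(loader=load_resnet50):
    """Return the cached (model, preprocess) pair, loading it on first use only."""
    global _cached
    if _cached is None:
        with _lock:  # avoid two threads loading at the same time
            if _cached is None:
                _cached = loader()
    return _cached


def create_app():
    """Hypothetical app factory: loads the model once, then serves requests."""
    import torch
    from flask import Flask, jsonify, request
    from PIL import Image

    app = Flask(__name__)
    model, preprocess = get_model()  # loaded once, at startup

    @app.route("/predict", methods=["POST"])
    def predict():
        # Only the per-request work happens here: decode, preprocess, forward pass.
        image = Image.open(request.files["image"].stream).convert("RGB")
        batch = preprocess(image).unsqueeze(0)  # add a batch dimension
        with torch.no_grad():
            logits = model(batch)
        return jsonify({"class_id": logits.argmax(dim=1).item()})

    return app
```

Note that if you serve the app with several worker processes, each process will load its own copy once. That is still once per worker rather than once per request.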