What is the best way to run inference on a model multiple times concurrently? (CPU only)

I have one server with 16 cores. I can receive at most 8 concurrent requests to run inference on the model. What method should I use to balance memory, CPU usage, and speed? Can you give me a few suggestions?
Thank you very much.
P/S: My current method uses multiprocessing, but it seems ineffective, because the model ends up loaded 8 times at once.
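For context, my current approach looks roughly like this minimal sketch (the `load_model` and `predict` functions here are stand-ins, since the real model code is not shown): each worker process loads its own copy of the model in its initializer, so with 8 workers the model sits in memory 8 times.

```python
from concurrent.futures import ProcessPoolExecutor

_model = None  # per-process global, populated by the worker initializer


def load_model():
    # Stand-in for a real model load (e.g. reading weights from disk).
    return {"scale": 2}


def _init_worker():
    # Runs once in EACH worker process, so 8 workers -> 8 copies of the model.
    global _model
    _model = load_model()


def predict(x):
    # Stand-in for real inference using the per-process model copy.
    return _model["scale"] * x


def run(n_requests=8):
    # 8 workers to match the maximum of 8 concurrent requests.
    with ProcessPoolExecutor(max_workers=8, initializer=_init_worker) as pool:
        return list(pool.map(predict, range(n_requests)))


if __name__ == "__main__":
    print(run())  # each request is served, but memory is multiplied by the worker count
```

This is what I mean by "the model is loaded up to 8 times": every process pays the full memory cost of the model, even though all copies are identical and read-only.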