Mask RCNN for Production

I have trained Mask RCNN starting from the pre-trained resnet50_fpn model. I want to use the saved weights in a production environment, where I store savedmodel.pt on a server and send data to it for inference. All goes well until 3 or more requests are made to the model. If more than 3 requests are made at the same time, the inference time goes from 2 s/image to 15 seconds. Is this normal? Is there any way to improve this?
PS: The behaviour is the same on CPU and on GPU.
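
For anyone who wants to reproduce the slowdown, here is a minimal sketch of how the per-request latency could be measured at different concurrency levels. It is not the actual serving code: `fake_inference` is a hypothetical stand-in that just sleeps, and `timed_call` and `measure_latency` are names made up for this illustration. In a real test, `fake_inference` would be replaced by the call that runs the loaded model on one image.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_inference(image):
    """Hypothetical stand-in for the real model call; replace with actual inference."""
    time.sleep(0.01)  # pretend the model takes 10 ms
    return {"image": image, "masks": []}


def timed_call(fn, image):
    """Run fn(image) once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    fn(image)
    return time.perf_counter() - start


def measure_latency(fn, images, concurrency):
    """Return one latency per image, with up to `concurrency` calls in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda img: timed_call(fn, img), images))


if __name__ == "__main__":
    # Compare mean and worst-case latency as the number of simultaneous requests grows.
    for n in (1, 2, 3, 4, 8):
        latencies = measure_latency(fake_inference, list(range(n)), n)
        mean = sum(latencies) / len(latencies)
        print(f"{n} concurrent: mean {mean:.3f}s, max {max(latencies):.3f}s")
```

With the real model plugged in, comparing the numbers for 1, 2, 3 and 4 concurrent calls should show whether latency grows roughly in proportion to the number of requests (which would suggest the requests are competing for the same cores or the same GPU) or jumps sharply at a specific threshold.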