Load the program only once and then feed it images as they arrive

Good evening,
When I run YOLOv9 instance segmentation prediction, the program and the model take 27 seconds to load, and then each image takes 1.5 seconds. Would it be possible to load the program once, then pass it images as they come in and have it process each one? I'm on Windows 10.
Please help me, it's quite urgent.
Best regards


Yes, a serving solution such as NVIDIA Triton Inference Server or TorchServe might be helpful.
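If a full serving stack turns out to be more than is needed, the same idea (pay the load cost once, then reuse the model) can be sketched as one long-running Python process that watches a folder. This is only a rough sketch under stated assumptions: `load_model` and `predict` are hypothetical placeholders for your own YOLOv9 loading and inference code, and the folder-polling design is my illustration, not something from the YOLOv9 repository.

```python
import time
from pathlib import Path
from typing import Callable, Iterator, Optional

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def load_model():
    """Hypothetical placeholder: put the slow, one-time YOLOv9 loading code here."""
    return object()


def predict(model, image_path: Path):
    """Hypothetical placeholder: run segmentation on one image with the loaded model."""
    return f"processed {image_path.name}"


def new_images(folder: Path, seen: set) -> Iterator[Path]:
    """Yield image files in `folder` that have not been yielded before."""
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES and path not in seen:
            seen.add(path)
            yield path


def serve(folder: Path,
          load: Callable = load_model,
          run: Callable = predict,
          poll_seconds: float = 0.5,
          max_polls: Optional[int] = None) -> list:
    """Load the model once, then process images as they appear in `folder`.

    With max_polls=None the loop runs until the process is stopped.
    """
    model = load()  # the expensive step happens exactly once
    seen: set = set()
    results = []
    polls = 0
    while max_polls is None or polls < max_polls:
        for image_path in new_images(folder, seen):
            results.append(run(model, image_path))
        polls += 1
        time.sleep(poll_seconds)
    return results
```

Calling `serve(Path("incoming"))` (with `incoming` being whatever existing folder you drop images into) would keep the process alive, so only the per-image time is paid after startup. One caveat with this design: a file that is still being copied may be picked up half-written, so writing it under a temporary name and renaming it when complete is safer.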

OK, thank you, but doesn't NVIDIA Triton Inference Server only work on PCs with NVIDIA GPUs installed? And as far as I know, TorchServe is meant for Windows Server, is not guaranteed to run on Windows 10, and is huge. Are there other solutions, please?

Could you give me a tutorial for installing TorchServe, providing it with the model, and making YOLOv9 instance segmentation predictions on the fly, please?