Compare Deployment Types To Deploy A Model


After training a model, I want to deploy it in production, but there are many options, such as TorchScript, ONNX, and TorchServe.
I have read about them, but it is quite difficult to understand them all.
Could you explain them in simple terms and compare them? Which one is best for production?

PS: I tested inference in a Python environment with both a regular PyTorch module and a TorchScript model. The regular PyTorch module was faster, which I find quite hard to understand.

Help me, thanks

Can anybody help me? Please.

PyTorch is an eager-mode framework, which means it runs Python code line by line. This is great when you're debugging your model, but it is less than ideal for performance because of Python overhead. TorchScript is a subset of the Python language that lets you serialize a PyTorch model and run it without needing Python on your machine, for example from C++. TorchScript also acts as a good intermediate representation that other optimization tools, such as ONNX export and ONNX-based runtimes, can build on to run your models faster.
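To make the TorchScript part concrete, here is a minimal sketch of compiling an eager module, saving it, and loading it back. `TinyNet` and its layer sizes are hypothetical stand-ins for your trained model, and an in-memory buffer is used instead of a file path only to keep the example self-contained.

```python
import io

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a trained model (not from the original post)."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 16)
        self.fc2 = nn.Linear(16, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


torch.manual_seed(0)
model = TinyNet().eval()          # eval mode for inference
example = torch.randn(4, 8)       # assumed input shape: batch of 4, 8 features

# Compile the eager module to TorchScript.
scripted = torch.jit.script(model)

# Serialize and reload. A file path such as "model.pt" works the same way;
# the saved artifact is what a Python-free runtime would load.
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
loaded = torch.jit.load(buffer)

with torch.no_grad():
    eager_out = model(example)
    # A few warm-up calls before using or timing the TorchScript model.
    for _ in range(3):
        loaded(example)
    scripted_out = loaded(example)

# The two paths should agree numerically.
outputs_match = torch.allclose(eager_out, scripted_out, atol=1e-6)
print(outputs_match)
```

On the timing question: the first few calls of a TorchScript model are typically slower because the runtime is still profiling and optimizing, so a benchmark without warm-up calls can make TorchScript look slower than eager mode. For small models run from Python, the difference may also simply be negligible, since the main benefit of TorchScript is running outside Python.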

TorchServe is a model-serving tool that lets you deploy a PyTorch model regardless of whether it is an eager-mode PyTorch model, a TorchScript model, or an ONNX model.
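Once a model is registered with TorchServe, clients talk to it over HTTP. The sketch below only builds a request for the inference endpoint (`POST /predictions/{model_name}` on the default port 8080) and does not send it. The model name `my_model`, the helper `build_prediction_request`, and the payload are hypothetical; the payload format depends on the handler your model uses.

```python
import urllib.request


def build_prediction_request(model_name, payload, host="http://localhost:8080"):
    """Build (but do not send) a POST request to a TorchServe inference endpoint.

    `host` assumes TorchServe's default inference address; change it if your
    server is configured differently.
    """
    url = f"{host}/predictions/{model_name}"
    return urllib.request.Request(url, data=payload, method="POST")


# Hypothetical model name and payload, for illustration only.
request = build_prediction_request("my_model", b'{"data": [1, 2, 3]}')
print(request.get_method(), request.full_url)
```

With a server running, passing `request` to `urllib.request.urlopen` would return the handler's response for that model.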