What is the most common way to deploy a PyTorch model?

I found that there is no dedicated serving framework for it, like the TF Serving server for TensorFlow.

So I searched for how PyTorch users serve their models, and found a few approaches like these (rough sketches of each follow the list):

  1. PyTorch model -> ONNX -> TensorRT Inference Server
  2. PyTorch model -> JIT (TorchScript) -> TensorRT Inference Server
  3. PyTorch model -> load in Python -> serve with Flask etc.
    and so on
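
For step 1, the ONNX export side typically looks something like this. A minimal sketch, assuming a torchvision resnet18 and a 1x3x224x224 input; the model, shape, and output path are just placeholders:

```python
# Sketch: export a PyTorch model to ONNX so an inference server can load it.
# resnet18, the input shape, and the output path are placeholder assumptions.
import torch
import torchvision

model = torchvision.models.resnet18(pretrained=True).eval()
dummy_input = torch.randn(1, 3, 224, 224)  # example input used for tracing

torch.onnx.export(
    model,
    dummy_input,
    "resnet18.onnx",
    input_names=["input"],
    output_names=["output"],
)
```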
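For step 2, "jit" means TorchScript. A minimal tracing sketch under the same assumptions:

```python
# Sketch: convert the model to TorchScript via tracing, save it, and load it
# back. The traced file can also be loaded from C++ (libtorch).
import torch
import torchvision

model = torchvision.models.resnet18(pretrained=True).eval()
example = torch.randn(1, 3, 224, 224)

traced = torch.jit.trace(model, example)  # record ops on the example input
traced.save("resnet18_traced.pt")         # placeholder path

loaded = torch.jit.load("resnet18_traced.pt")
with torch.no_grad():
    print(loaded(example).shape)
```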
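And for step 3, a minimal Flask sketch. The /predict route, JSON format, and model are made up for illustration; a real service would add input validation, batching, and a production WSGI server:

```python
# Sketch: load the model in Python and serve it behind a Flask endpoint.
import torch
import torchvision
from flask import Flask, jsonify, request

app = Flask(__name__)
model = torchvision.models.resnet18(pretrained=True).eval()

@app.route("/predict", methods=["POST"])
def predict():
    # Expect JSON like {"input": [[[[...]]]]} matching the model's input shape.
    payload = request.get_json()
    x = torch.tensor(payload["input"], dtype=torch.float32)
    with torch.no_grad():
        y = model(x)
    return jsonify(y.tolist())

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```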

I wonder what the common or preferred way to serve a PyTorch model is.

There’s also ONNX Runtime, which has a model server in beta.
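
Even without the beta server, an exported model can be run directly with ONNX Runtime’s Python API. A minimal sketch, reusing the placeholder .onnx path and input shape from the export example above:

```python
# Sketch: run an exported ONNX model with ONNX Runtime's Python API.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("resnet18.onnx")
input_name = sess.get_inputs()[0].name
x = np.random.randn(1, 3, 224, 224).astype(np.float32)
outputs = sess.run(None, {input_name: x})  # None -> return all outputs
print(outputs[0].shape)
```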
And there’s an RFC open for a production serving framework for PyTorch.